Delete rows that contain 0 more than 'x' times

Question

If you have a wired Internet connection, you can connect from the command line by running dhclient eth0 (you must be root). Once connected, simply run yum install NetworkManager. The name NetworkManager is case-sensitive.

text-processing awk sed grep

Rui F Ribeiro 21.11.2018, 00:37

6 answers

steeldriver · Answer 1 · 27.01.2020, 21:09

KISS approach, with awk

awk -F, '{c = 0; for(i=1; i<=NF; i++) {c += $i == "0" ? 1 : 0}} c <= 3' file.csv
    gene,v1,v2,v3,v4,v5,v6,v7
    gene1,0,1,5,0,0,4,100
    gene2,1,0,0,0,5,210,2

With perl

perl -F, -lane 'print unless (grep { $_ eq "0" } @F) > 3' file.csv
    gene,v1,v2,v3,v4,v5,v6,v7
    gene1,0,1,5,0,0,4,100
    gene2,1,0,0,0,5,210,2
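
The fixed limit in the awk command above can be passed in as a variable instead. The sketch below is one way to do that, not the answerer's own code: the variable name max and the inline sample rows are assumptions made for illustration.

```shell
#!/bin/sh
# Sample rows in the same layout as the question's file (illustrative data).
data='gene,v1,v2,v3,v4,v5,v6,v7
gene1,0,1,5,0,0,4,100
gene2,1,0,0,0,5,210,2
gene3,0,0,0,0,6,0,0'

# x is the limit; max is a hypothetical awk variable name for it.
x=3
result=$(printf '%s\n' "$data" | awk -F, -v max="$x" '
    # count the fields that are exactly the string "0"
    { c = 0; for (i = 1; i <= NF; i++) if ($i == "0") c++ }
    # print the row only when the count does not exceed the limit
    c <= max
')
printf '%s\n' "$result"
```

Changing x changes how many zero fields a row may have before it is dropped.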

ilkkachu · Answer 2 · 27.01.2020, 21:09

With awk -F',0', three copies of ,0 are taken as three separators, giving four fields in total. So if you use awk -F',0' 'NF<5 {print}' instead, you should see the correct lines in the output.

Note that ,0 also matches inside strings like 213,0123, which you may or may not want to count as a zero.

So you could also use , as the field separator and count the fields that contain only a zero:

awk -F, '{z=0; for (i = 1 ; i <= NF ; i++) if ($i == 0) z++} z <= 3' file.csv
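
To make the difference between the two approaches concrete, here is a small sketch that counts zeros both ways on a single line. The sample line and the variable names by_separator and by_field are made up for illustration.

```shell
#!/bin/sh
# Illustrative line: only one field is exactly 0, but ",0" also occurs in ",0123".
line='geneA,213,0123,0,5'

# Counting ",0" separators: number of fields minus one.
by_separator=$(printf '%s\n' "$line" | awk -F',0' '{ print NF - 1 }')

# Counting comma-separated fields whose value compares equal to 0.
by_field=$(printf '%s\n' "$line" |
    awk -F, '{ z = 0; for (i = 1; i <= NF; i++) if ($i == 0) z++; print z }')

echo "separator count: $by_separator"
echo "field count:     $by_field"
```

The separator-based count reports 2 here, while the per-field count reports 1.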

Ondřej Xicht Světlík · Answer 3 · 27.01.2020, 21:09

You can also solve your problem using regular expressions and grep.

grep -Ev '(,0(,[^0,]+)*){4,}' file.csv

I tested it on this file:

gene,v1,v2,v3,v4,v5,v6,v7
gene1,0,1,5,0,0,4,100
gene2,1,0,0,0,5,210,2
gene3,0,0,0,0,6,0,0
gene4,0,0,0,4,6,0,0
gene5,0,1,0,4,6,0,0

There are some assumptions:

no non-zero number starts with a zero,
zero values consist of a single zero,
all numbers are integers.

The regular expression could be extended to handle such cases if you need it.
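
One possible extension, offered as a sketch rather than the answerer's own regex: describe a non-zero field as either starting with a character other than 0, or starting with 0 followed by at least one more character, and require a field boundary after the last counted zero. This also lets non-zero fields contain the digit 0 (for example 100). Empty fields are still assumed not to occur, and the rows gene6 and gene7 are made up to exercise the new cases.

```shell
#!/bin/sh
# Test rows from the answer, plus two hypothetical rows (gene6, gene7).
data='gene,v1,v2,v3,v4,v5,v6,v7
gene1,0,1,5,0,0,4,100
gene2,1,0,0,0,5,210,2
gene3,0,0,0,0,6,0,0
gene4,0,0,0,4,6,0,0
gene5,0,1,0,4,6,0,0
gene6,0,0,100,0,0
gene7,0,0,0,05,10'

# ,0                        a field that is exactly 0
# ,([^0,][^,]*|0[^,]+)      a non-zero field: starts with a non-0 character,
#                           or starts with 0 and has at least one more character
# (,|$)                     the last counted 0 must end at a field boundary
result=$(printf '%s\n' "$data" |
    grep -Ev '(,0(,([^0,][^,]*|0[^,]+))*){4,}(,|$)')
printf '%s\n' "$result"
```

With this pattern gene6 (four zeros separated by 100) is removed, while gene7 (three zeros and 05) is kept.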

steve · Answer 4 · 27.01.2020, 21:09

Surely the answer is simply

awk -F,0 'NF<5' file.csv

Use a delimiter of ",0" and, when the number of fields is less than 5, perform the default action, which is to print.

I tested it on this file

gene,v1,v2,v3,v4,v5,v6,v7
gene1,0,1,5,0,0,4,100
gene2,1,0,0,0,5,210,2
gene3,0,0,0,0,6,0,0
gene4,0,0,0,4,6,0,0
gene5,0,1,0,4,6,0,0

Which gave this output

gene,v1,v2,v3,v4,v5,v6,v7
gene1,0,1,5,0,0,4,100
gene2,1,0,0,0,5,210,2
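
To see why NF<5 keeps exactly these rows, the following sketch prints, for each row of the test file, how many ",0" separators awk finds (NF minus one). The variable name counts and the output layout are assumptions for illustration.

```shell
#!/bin/sh
# The test file from the answer, inlined.
data='gene,v1,v2,v3,v4,v5,v6,v7
gene1,0,1,5,0,0,4,100
gene2,1,0,0,0,5,210,2
gene3,0,0,0,0,6,0,0
gene4,0,0,0,4,6,0,0
gene5,0,1,0,4,6,0,0'

# With ",0" as the separator, NF - 1 is the number of ",0" occurrences.
# split() on "," is only used to recover the row name for display.
counts=$(printf '%s\n' "$data" |
    awk -F',0' '{ n = NF - 1; split($0, a, ","); print a[1], n }')
printf '%s\n' "$counts"
```

Rows with a count of 3 or less have fewer than 5 fields and are printed by the one-liner; the rest are dropped.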

Rakesh Sharma · Answer 5 · 27.01.2020, 21:09

This can be done as follows:

¶ split records on a comma

  perl -F'/,(?=0,|0$)/' -lane 'print if $#F < 4' csv.file 

° split on those commas that are immediately followed by either a 0 and another comma, or a 0 at the end of the line.

° the array formed by splitting up the record ($_) is @F, and its last index ($#F) equals the number of such commas.

¶ sed-based

 sed -ne '
     h;1b print
     s/,/,,/g;s/$/,/;t reset
     :reset;s/,0,/&/4;t
     :print;g;p
 '  csv.file

° we double the commas because the matches would otherwise overlap. We also append a comma at the end so that the last field matches the same way.
° a dummy t command is run first to clear the test flag; otherwise the actual t command that follows would misfire.
° an s/// command then attempts a fourth substitution. If it succeeds, there are at least four pure-zero fields. We don't want such lines, so the label-less t command transfers control to the end of the script, skipping any further processing. The -n sed option prevents the line from being printed.
° if the substitution fails, there were three or fewer pure-zero fields, and we want such lines.
° before making any changes we stored the original, unmodified line in the hold space, so we get it back and print it.
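
The reason for doubling the commas can be seen in a small sketch. It replaces every ,0, with a visible marker, once without doubling and once with it. The marker <Z>, the sample line, and the variable names plain and doubled are made up for illustration.

```shell
#!/bin/sh
# Illustrative row with four adjacent zero fields.
line='gene3,0,0,0,0'

# Without doubling: neighbouring zeros share a comma, so matches overlap
# and only every other zero is found.
plain=$(printf '%s\n' "$line" | sed 's/$/,/; s/,0,/<Z>/g')

# With doubling: every zero field gets its own pair of commas.
doubled=$(printf '%s\n' "$line" | sed 's/,/,,/g; s/$/,/; s/,0,/<Z>/g')

echo "plain:   $plain"
echo "doubled: $doubled"
```

Only two markers appear in the first result but four in the second, which is why a "fourth substitution" test is only reliable after the commas have been doubled.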

αғsнιη · Answer 6 · 27.01.2020, 21:09

If all the numbers are integers, then with GNU awk, which supports the word boundaries \<...\>, you can do

gawk 'gsub(/\<0\>/, "0") < 4' infile
