Michael Berman created ACCUMULO-4541:
----------------------------------------

             Summary: shell `grep -o` writes incomplete files
                 Key: ACCUMULO-4541
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4541
             Project: Accumulo
          Issue Type: Bug
          Components: shell
    Affects Versions: 1.7.2
            Reporter: Michael Berman


{{grep -o}} appears to write truncated files. Perhaps a flush is missing? I do 
not observe the same behavior with {{scan -o}}.

{code}
$ accumulo shell ... -e 'grep a -t tablename -np -o /tmp/grep_-o.out'
2016-12-21 23:38:29,350 [client.ClientConfiguration] WARN : Found no client.conf in default paths. Using default client configuration values.
2016-12-21 23:38:29,358 [client.ClientConfiguration] WARN : Found no client.conf in default paths. Using default client configuration values.
2016-12-21 23:38:31,488 [client.ClientConfiguration] WARN : Found no client.conf in default paths. Using default client configuration values.
2016-12-21 23:38:31,763 [trace.DistributedTrace] INFO : SpanReceiver org.apache.accumulo.tracer.ZooTraceClient was loaded successfully.

$ accumulo shell ... -e 'grep a -t tablename -np' > /tmp/grep_redirect.out
$ wc /tmp/grep_*.out
  44030  194670 4669440 /tmp/grep_-o.out
  44165  195171 4761183 /tmp/grep_redirect.out
{code}

The redirect output contains a few extra lines that are explained by the log 
messages, but they do not account for a difference of more than 100 lines (135 
by the {{wc}} counts above). The final line in grep_-o.out is truncated in the 
middle of the row column.

If I include some terms in the grep to limit the output, I often get 
zero-length files via -o despite seeing results when run interactively.
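
The missing-flush guess would fit both symptoms: 4669440 bytes is exactly 570 x 8192, and 8192 characters is the default buffer size of {{java.io.BufferedWriter}}. The following is only a minimal sketch of that failure mode in plain Java, not the shell's actual code; {{FlushSketch}} and {{sizeAfterWrite}} are hypothetical names invented for illustration, and the row format is made up.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a command that writes rows to an -o file.
// This is NOT Accumulo code; it only illustrates an unflushed writer.
final class FlushSketch {

  // Writes `rows` fake entries to a temp file and returns the size on disk.
  // When `close` is false, the writer is deliberately left open and unflushed.
  static long sizeAfterWrite(int rows, boolean close) {
    try {
      Path out = Files.createTempFile("grep_-o", ".out");
      out.toFile().deleteOnExit();
      BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8);
      for (int i = 0; i < rows; i++) {
        writer.write("row" + i + " cf:cq []    value" + i);
        writer.newLine();
      }
      if (close) {
        writer.close(); // close() flushes any buffered characters first
      }
      return Files.size(out);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Small result set: everything fits in the buffer, so nothing reaches disk.
    System.out.println("10 rows, not closed:    " + sizeAfterWrite(10, false));
    System.out.println("10 rows, closed:        " + sizeAfterWrite(10, true));
    // Large result set: only the chunks already pushed out of the buffers land on disk.
    System.out.println("50000 rows, not closed: " + sizeAfterWrite(50000, false));
    System.out.println("50000 rows, closed:     " + sizeAfterWrite(50000, true));
  }
}
```

If the shell has a similar pattern, a small result set would stay entirely in the buffer (a zero-length file) and a large one would lose its tail, which would be consistent with the observations above. Whether that is what actually happens in {{grep -o}} still needs to be confirmed against the source.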



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
