Found something interesting while helping a user find a file that was bulk
imported with a bad column visibility (CV).  I was about to write it off as
"well, if you ingest data with a bad CV then you won't be able to get it
back", but it looks like validation was added in ACCUMULO-360 [1] to prevent
bulk import with an invalid CV.  That validation was added to
AccumuloFileOutputFormat, but I don't see recent versions using
AccumuloFileOutputFormat during bulk import.

I did some bulk imports with a CV of "A|B|" (note the trailing "|", which
leaves an empty term) using Uno across different versions:
1.6 - RFile imported; scan throws a server error
1.7 - RFile imported; scan returns only rows with a valid visibility and
      does not throw an error
1.9 - RFile imported; scan returns only rows with a valid visibility and
      does not throw an error

I attached the stack trace, which only shows up in 1.6.

Has anyone run into this issue before?  Perhaps this validation was removed
for performance reasons?
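
In the meantime, an ingest job could reject malformed visibilities before
the RFile is ever written.  Below is a minimal sketch of such a pre-check,
written against the JDK only so it can be run on its own.  The class name
VisibilityPrecheck and its grammar are hypothetical and simplified: it only
handles unquoted terms, "&", "|" and parentheses, and it is not Accumulo's
parser.  In a real job, constructing
org.apache.accumulo.core.security.ColumnVisibility from the expression
should serve the same purpose, since the stack trace shows its constructor
calling validate() and failing on "A|B|".

```java
// Hypothetical, simplified stand-in for a visibility syntax check.
// Assumption: unquoted terms only; quoted terms are not supported here.
final class VisibilityPrecheck {
    private final String s;
    private int pos = 0;

    private VisibilityPrecheck(String s) {
        this.s = s;
    }

    /** Returns true if expr parses under the simplified grammar below. */
    static boolean isWellFormed(String expr) {
        if (expr.isEmpty()) {
            return true; // an empty visibility means "no restriction"
        }
        VisibilityPrecheck p = new VisibilityPrecheck(expr);
        // The whole input must be consumed, so a stray ")" is rejected.
        return p.parseExpr() && p.pos == expr.length();
    }

    // expr := term (op term)*, with a single operator kind per nesting level
    private boolean parseExpr() {
        if (!parseTerm()) {
            return false;
        }
        char op = 0;
        while (pos < s.length() && (s.charAt(pos) == '|' || s.charAt(pos) == '&')) {
            char c = s.charAt(pos++);
            if (op != 0 && c != op) {
                return false; // mixing '&' and '|' requires parentheses
            }
            op = c;
            if (!parseTerm()) {
                return false; // operator followed by nothing: empty term
            }
        }
        return true;
    }

    // term := '(' expr ')' | one or more token characters
    private boolean parseTerm() {
        if (pos < s.length() && s.charAt(pos) == '(') {
            pos++;
            if (!parseExpr()) {
                return false;
            }
            if (pos >= s.length() || s.charAt(pos) != ')') {
                return false; // unbalanced parenthesis
            }
            pos++;
            return true;
        }
        int start = pos;
        while (pos < s.length() && isTokenChar(s.charAt(pos))) {
            pos++;
        }
        return pos > start; // false when the term is empty
    }

    private static boolean isTokenChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9')
                || c == '_' || c == '-' || c == ':' || c == '.' || c == '/';
    }
}
```

With this sketch, VisibilityPrecheck.isWellFormed("A|B") passes while
VisibilityPrecheck.isWellFormed("A|B|") is rejected because of the empty
term after the last "|", so the job could fail fast instead of producing an
RFile whose rows are silently filtered out at scan time.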

[1] https://issues.apache.org/jira/browse/ACCUMULO-360

2018-07-27 16:38:51,700 [system.VisibilityFilter] ERROR: Parse Error
org.apache.accumulo.core.util.BadArgumentException: empty term near index 4
A|B|
    ^
        at org.apache.accumulo.core.security.ColumnVisibility$ColumnVisibilityParser.processTerm(ColumnVisibility.java:305)
        at org.apache.accumulo.core.security.ColumnVisibility$ColumnVisibilityParser.parse_(ColumnVisibility.java:405)
        at org.apache.accumulo.core.security.ColumnVisibility$ColumnVisibilityParser.parse(ColumnVisibility.java:286)
        at org.apache.accumulo.core.security.ColumnVisibility.validate(ColumnVisibility.java:421)
        at org.apache.accumulo.core.security.ColumnVisibility.<init>(ColumnVisibility.java:466)
        at org.apache.accumulo.core.security.ColumnVisibility.<init>(ColumnVisibility.java:455)
        at org.apache.accumulo.core.iterators.system.VisibilityFilter.accept(VisibilityFilter.java:73)
        at org.apache.accumulo.core.iterators.Filter.findTop(Filter.java:72)
        at org.apache.accumulo.core.iterators.Filter.next(Filter.java:59)
        at org.apache.accumulo.core.iterators.system.SynchronizedIterator.next(SynchronizedIterator.java:51)
        at org.apache.accumulo.core.iterators.WrappingIterator.next(WrappingIterator.java:96)
        at org.apache.accumulo.core.iterators.user.VersioningIterator.skipRowColumn(VersioningIterator.java:97)
        at org.apache.accumulo.core.iterators.user.VersioningIterator.next(VersioningIterator.java:58)
        at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.readNext(SourceSwitchingIterator.java:139)
        at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.next(SourceSwitchingIterator.java:123)
        at org.apache.accumulo.tserver.Tablet.nextBatch(Tablet.java:1707)
        at org.apache.accumulo.tserver.Tablet.access$3200(Tablet.java:177)
        at org.apache.accumulo.tserver.Tablet$Scanner.read(Tablet.java:1838)
        at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$NextBatchTask.run(TabletServer.java:1133)
        at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
        at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
        at java.lang.Thread.run(Thread.java:748)
