Line Parsing Errors and Skipping

2015-11-03 Thread John Omernik
I am doing some "active" loading of data into JSON files on MapRFS. Basically, I have feeds pulling from a message queue and outputting the JSON messages. I have a query doing aggregations on all the data, and it seems to work 90% of the time. The other 10%, I get this error: Error: DATA_REA

Re: Line Parsing Errors and Skipping

2015-11-03 Thread mark charts
Hi. I read your dilemma. Would a trap in the program to handle this error or exception work for you in this case, addressing it by skipping around the trouble? My guess is you have a timing condition gone astray somewhere and you need to ensure all states are timed correctly. But what do I know. Good lu

Re: Line Parsing Errors and Skipping

2015-11-03 Thread Andries Engelbrecht
See DRILL-2424 and DRILL-1131. Incomplete records/files can cause issues; in Drill 1.2 they have added the ability to ignore data files with a . prefix. Perhaps copy files in over NFS using a . prefix and then rename them once copied onto the DFS. I had the same issue with Flume data streaming in and inc

Re: Line Parsing Errors and Skipping

2015-11-03 Thread John Omernik
Well, I have one program writing data via Python to MapRFS in a directory that Drill is reading, so yes, I have two different programs reading and writing data. What I am looking for here is this: knowing I may have this scenario where a read may occur before a write is complete, can I just have Drill ig

Re: Line Parsing Errors and Skipping

2015-11-03 Thread John Omernik
Great feature, and this fixes my problem. In my Python script, when I open a file, I open it with the . prefix. When I "close" it, I rename it without the . prefix. Easy fix. Thanks for the pointer Andries! John On Tue, Nov 3, 2015 at 1:52 PM, Andries Engelbrecht < aengelbre...@maprtech.c
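
The write-then-rename approach described above could look roughly like the following in Python. This is only a sketch, not the script from the thread: the helper names `hidden_until_closed` and `write_messages` are made up for illustration, and it assumes the target directory is reachable as an ordinary mounted path (for example MapRFS over NFS) where a rename within the same directory behaves as a single step.

```python
import json
import os
from contextlib import contextmanager


@contextmanager
def hidden_until_closed(directory, filename):
    """Write to '.<filename>', then rename to '<filename>' after closing.

    Hypothetical helper: while the file is being written it carries a
    '.' prefix, so a reader that skips dot-prefixed files never sees a
    partially written record.
    """
    hidden_path = os.path.join(directory, "." + filename)
    final_path = os.path.join(directory, filename)
    with open(hidden_path, "w") as f:
        yield f
    # Reached only if the body finished without raising; on an error the
    # dot-prefixed file is left in place and stays hidden.
    os.replace(hidden_path, final_path)


def write_messages(directory, filename, messages):
    """Write one JSON document per line, exposing the file only when complete."""
    with hidden_until_closed(directory, filename) as out:
        for msg in messages:
            out.write(json.dumps(msg) + "\n")
```

A call such as `write_messages("/path/to/feed", "batch.json", msgs)` (path and file name are placeholders) leaves only `.batch.json` visible until the last line is written, at which point the file appears as `batch.json`.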