[ https://issues.apache.org/jira/browse/ASTERIXDB-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451212#comment-15451212 ]
Wail Alkowaileet commented on ASTERIXDB-1616:
---------------------------------------------

The problem occurs with variable-length fields. When you project (return) only primitive types at level zero (the dataset type), the issue disappears. I remember trying to debug it, and it seems that the length offset of a string (for example) is off by one byte, which leads to reading 4 random bytes as a length that is far larger than the frame size. Hence, it sometimes throws IndexOutOfBoundsException.

However, making the schema fully open (key-only) should solve the problem, which makes me wonder whether this has anything to do with the parser. I suspect the problem resides in ARecordBuilder and friends. I will try to print the record right before flush (and after parsing) and see if I get the same issue.

> NPE when printing record inside open type with unicode fields
> -------------------------------------------------------------
>
>                 Key: ASTERIXDB-1616
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1616
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Ian Maxon
>            Assignee: Ian Maxon
>
> DDL:
> https://github.com/kevincoakley/asterixdb_tests/blob/master/notebooks/asterixdb-spark/Count%20one_percent%20Tweets%20Spark%20Single.ipynb
> Data:
> https://object.cloud.sdsc.edu/v1/AUTH_kcoakley/asterixdblogs/2015_11_07_00_onepercent.txt
> Basically just a scan+limit on the one_percent dataset will give
> IndexOutOfBounds.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
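
To illustrate the failure mode described in the comment above, here is a minimal Java sketch of how a length offset that is off by one byte can produce a bogus length larger than the frame. Everything in it is an assumption made for illustration: the toy layout (1-byte tag, 4-byte big-endian length, UTF-8 payload), the 64-byte frame size, the direction of the off-by-one, and the names `OffByOneLengthDemo`, `buildFrame`, `readLength`, and `readString`. It is not AsterixDB's actual record format or the code in ARecordBuilder.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Toy model only: NOT the AsterixDB on-disk or in-frame record layout.
class OffByOneLengthDemo {
    // Hypothetical frame size, chosen small so the effect is easy to see.
    static final int FRAME_SIZE = 64;

    // Assumed layout: [1-byte type tag][4-byte big-endian length][UTF-8 payload].
    static ByteBuffer buildFrame(String value) {
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer frame = ByteBuffer.allocate(FRAME_SIZE);
        frame.put((byte) 13);         // hypothetical type tag
        frame.putInt(payload.length); // length field starts at offset 1
        frame.put(payload);
        return frame;
    }

    // Reads the 4-byte length field at the given absolute offset.
    static int readLength(ByteBuffer frame, int lengthOffset) {
        return frame.getInt(lengthOffset);
    }

    // Reads the string whose length field is believed to start at lengthOffset.
    static String readString(ByteBuffer frame, int lengthOffset) {
        int length = readLength(frame, lengthOffset);
        byte[] out = new byte[length];
        for (int i = 0; i < length; i++) {
            // Absolute get throws IndexOutOfBoundsException past the buffer limit.
            out[i] = frame.get(lengthOffset + 4 + i);
        }
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer frame = buildFrame("tweet");
        System.out.println("length at correct offset: " + readLength(frame, 1));
        System.out.println("length at offset + 1:     " + readLength(frame, 2));
        try {
            readString(frame, 2);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("read past the frame: " + e.getClass().getSimpleName());
        }
    }
}
```

In this toy layout, reading the length one byte too late mixes the low byte of the real length with the first payload byte, so the decoded length depends on the data. That is consistent with the exception appearing only sometimes, although the real cause in AsterixDB may differ.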