It's definitely true that we at least intend to filter out nulls. The use of createFilterExpression() in IntroduceSecondaryIndexInsertDeleteRule makes this fairly clear. Unless I am misinterpreting the meaning of a tuple being fed into the RTreeBulkLoader (i.e. if I see a tuple there, it's meant to be inserted), the filter doesn't seem to be applied properly somehow.
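To make the expectation concrete, here is a rough sketch of the behavior I have in mind. It is not the actual AsterixDB/Hyracks code: the class, the Rec type, and both method names are hypothetical stand-ins. It only assumes that records with a null secondary key are dropped before the bulk loader sees them, and that the MBR computation is entitled to treat any null that still arrives as a bug.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; none of these are real AsterixDB/Hyracks classes.
final class NullKeyFilterSketch {

    /** A record with a primary key and an optional 2D point key (null when the field is absent). */
    static final class Rec {
        final long pk;
        final double[] point; // {x, y}, or null if the optional field is missing

        Rec(long pk, double[] point) {
            this.pk = pk;
            this.point = point;
        }
    }

    /** Keeps only the records whose secondary key is non-null, i.e. the ones meant for the index. */
    static List<Rec> filterNonNullKeys(List<Rec> input) {
        List<Rec> out = new ArrayList<>();
        for (Rec r : input) {
            if (r.point != null) {
                out.add(r);
            }
        }
        return out;
    }

    /** Computes the MBR as {minX, minY, maxX, maxY}; a null key here is treated as a bug. */
    static double[] computeMbr(List<Rec> recs) {
        double[] mbr = {
            Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY,
            Double.NEGATIVE_INFINITY, Double.NEGATIVE_INFINITY
        };
        for (Rec r : recs) {
            if (r.point == null) {
                throw new IllegalStateException("null key reached the bulk loader, pk=" + r.pk);
            }
            mbr[0] = Math.min(mbr[0], r.point[0]);
            mbr[1] = Math.min(mbr[1], r.point[1]);
            mbr[2] = Math.max(mbr[2], r.point[0]);
            mbr[3] = Math.max(mbr[3], r.point[1]);
        }
        return mbr;
    }
}
```

With the three records from the issue, only the one with fa=3 would survive the filter, and the MBR step would never see a null. The stack trace suggests that in the load path the second step is reached without the first having taken effect.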

On Wed, Dec 2, 2015 at 8:55 AM, Mike Carey <[email protected]> wrote:
> Thx! (Stupid Q, I know, but I'd forgotten what we decided there N years
> ago... :-))
>
> On 12/2/15 7:41 AM, Sattam Alsubaiee wrote:
>>
>> Same for all other indexes. Nulls won't be sent to any index, they are
>> always filtered.
>>
>> Sattam
>> On Dec 2, 2015 6:31 PM, "Mike Carey" <[email protected]> wrote:
>>
>>> And for other types of indexes? (Seems like this would be behavior we'd
>>> want at the index level, one way or the other, vs. the detailed kind of
>>> index level?)
>>>
>>> On 12/2/15 5:15 AM, Sattam Alsubaiee wrote:
>>>
>>>> Nulls won't be sent to the R-tree. They will be filtered and if they are
>>>> not filtered, then it is a bug.
>>>>
>>>> Cheers,
>>>> Sattam
>>>>
>>>> On Wed, Dec 2, 2015 at 5:25 AM, Ian Maxon (JIRA) <[email protected]> wrote:
>>>>
>>>>> [ https://issues.apache.org/jira/browse/ASTERIXDB-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>>>>
>>>>> Ian Maxon reopened ASTERIXDB-1201:
>>>>> ----------------------------------
>>>>>
>>>>> I think this should still happen with more data of a similar vein (i.e.
>>>>> lots of nulls)? I see in the code actually where this should happen and it
>>>>> must just not be triggered in the modified bulkload for that particular
>>>>> data. The real issue is when we try to calculate the MBR of a null shape.
>>>>> adjust/calculateMBRImpl just doesn't handle that, it's expecting to see
>>>>> something with doubles in a corner of the shape.
>>>>>
>>>>> RTree built on the optional field refuses to load the NULL value when executing the bulk load
>>>>> ---------------------------------------------------------------------------------------------
>>>>>
>>>>>> Key: ASTERIXDB-1201
>>>>>> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1201
>>>>>> Project: Apache AsterixDB
>>>>>> Issue Type: Bug
>>>>>> Components: Storage
>>>>>> Reporter: Jianfeng Jia
>>>>>> Assignee: Ian Maxon
>>>>>>
>>>>>> When I build a RTree index on an optional field, it will throw "Value
>>>>>> provider for type NULL is not implemented" exception when operates the
>>>>>> bulk load.
>>>>>> Here is the reproducible script:
>>>>>> {code}
>>>>>> drop dataverse test if exists;
>>>>>> create dataverse test;
>>>>>> use dataverse test;
>>>>>> create type t_record as closed {
>>>>>> fa : int64,
>>>>>> fb: int64?,
>>>>>> fc : point?
>>>>>> }
>>>>>> create dataset ds_set (t_record) primary key fa;
>>>>>> create index bidx on ds_set(fb) type btree;
>>>>>> create index cidx on ds_set(fc) type rtree;
>>>>>> insert into dataset ds_set ( [{"fa":1}, {"fa":2, "fb":3}, {"fa":3, "fc":point("4.0,5.0")}]);
>>>>>> load dataset ds_set
>>>>>> using localfs
>>>>>> (("path"="172.17.0.2:///data/twitter/test.adm"),("format"="adm"));
>>>>>> {code}
>>>>>> The "insert" and "load" statements are run separately.
>>>>>> The test.adm uses the same three records:
>>>>>> {code}
>>>>>> {"fa":1}
>>>>>> {"fa":2, "fb":3}
>>>>>> {"fa":3, "fc":point("4.0,5.0")
>>>>>> {code}
>>>>>> The insert statement works fine. The error happens in the "load"
>>>>>> statement only:
>>>>>> {code}
>>>>>> Caused by: org.apache.hyracks.algebricks.common.exceptions.NotImplementedException: Value provider for type NULL is not implemented
>>>>>> at org.apache.asterix.dataflow.data.nontagged.valueproviders.AqlPrimitiveValueProviderFactory$1.getValue(AqlPrimitiveValueProviderFactory.java:64)
>>>>>> at org.apache.hyracks.storage.am.rtree.frames.RTreeNSMFrame.adjustMBRImpl(RTreeNSMFrame.java:132)
>>>>>> at org.apache.hyracks.storage.am.rtree.frames.RTreeNSMFrame.adjustMBR(RTreeNSMFrame.java:153)
>>>>>> at org.apache.hyracks.storage.am.rtree.impls.RTree$RTreeBulkLoader.propagateBulk(RTree.java:954)
>>>>>> at org.apache.hyracks.storage.am.rtree.impls.RTree$RTreeBulkLoader.end(RTree.java:937)
>>>>>> at org.apache.hyracks.storage.am.lsm.rtree.impls.LSMRTree$LSMRTreeBulkLoader.end(LSMRTree.java:584)
>>>>>> at org.apache.hyracks.storage.am.common.dataflow.IndexBulkLoadOperatorNodePushable.close(IndexBulkLoadOperatorNodePushable.java:107)
>>>>>> ... 7 more
>>>>>> {code}
>>>>>> The BTree index works fine if I remove the RTree index.
>>>>>
>>>>> --
>>>>> This message was sent by Atlassian JIRA
>>>>> (v6.3.4#6332)
