[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15602663#comment-15602663
 ] 

Ian Maxon commented on ASTERIXDB-1699:
--------------------------------------

The main obstacle to diagnosing this so far has been that we cannot attach 
a debugger to the instance while the search is running. Once the alternate 
cluster finishes loading, it may be easier to do that without interrupting 
things too much. 

> Inverted Index fail to match the keyword
> ----------------------------------------
>
>                 Key: ASTERIXDB-1699
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1699
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: Storage
>         Environment: master : 4819ea44723b87a68406d248782861cf6e5d3305
>            Reporter: Jianfeng Jia
>            Assignee: Ian Maxon
>
> It is not yet clear how to reproduce this on a smaller dataset. The symptom 
> is as follows. If I run this query, which skips the index:
> {code}
> for $t in dataset twitter.ds_tweet
> where $t.'create_at' >= datetime('2016-10-19T00:00:47.473Z') and 
> $t.'create_at' < datetime('2016-10-19T00:01:47.473Z') 
> and  /* +skip-index */ similarity-jaccard(word-tokens($t.'text'), 
> word-tokens('sleep')) > 0.0
> return $t.text
> {code}
> It returns some results:
> {code}
> "No point in going to sleep now lol"
> "Can't sleep"
> "TL Sleep"
> "i can't sleep man"
> "Blazed and I still can't sleep fackkkk.."
> "When you're proud of yourself for going to bed in time to get 6 hours of 
> sleep #CollegeLyfeAmIRightIAmIt'sSoCrazyLol"
> "I would be sleep rn but have to lurk bc I'm no sucka & bc the fan isn't 
> working"
> "Since I can't sleep https://t.co/ALZE4psIqP"
> "Wish I Could Sleep"
> "Of course when I go to lay down finally, I am not tired. To sleep or not to 
> sleep?? That's the real question."
> {code}
> If I run the same query with the index enabled:
> {code}
> for $t in dataset twitter.ds_tweet
> where $t.'create_at' >= datetime('2016-10-19T00:00:47.473Z') and 
> $t.'create_at' < datetime('2016-10-19T00:01:47.473Z') 
> and  similarity-jaccard(word-tokens($t.'text'), word-tokens('sleep')) > 0.0
> return $t.text
> {code}
> It returns an empty result. 
> The debug port is 8001 on each Cloudberry NUC NC. 
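
For reference, the expected behaviour can be checked outside the database. The sketch below is only an illustration in Python, not AsterixDB code: `word_tokens` and `similarity_jaccard` are hypothetical stand-ins that assume lowercase alphanumeric tokenization, which may differ from the real `word-tokens` function. It shows that any text sharing the token "sleep" with the query has a Jaccard similarity above 0.0, so the indexed query would be expected to return the same tweets as the skip-index one.

```python
import re


def word_tokens(text):
    """Hypothetical tokenizer: lowercase the text and return its set of
    alphanumeric word tokens (an assumption, not AsterixDB's tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity_jaccard(a, b):
    """Jaccard similarity: size of the intersection divided by size of the
    union. Returns 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


query = word_tokens("sleep")

# A few of the tweets returned by the skip-index query above.
tweets = [
    "Can't sleep",
    "Wish I Could Sleep",
    "No point in going to sleep now lol",
]

for tweet in tweets:
    score = similarity_jaccard(word_tokens(tweet), query)
    # Each tweet contains the token "sleep", so the predicate "> 0.0" holds.
    print(f"{score:.3f}  {score > 0.0}  {tweet}")
```

Under these assumptions every sample tweet passes the `> 0.0` predicate, which is consistent with the skip-index result and suggests the empty result comes from the index path rather than from the similarity predicate itself.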



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
