Re: Having trouble indexing nested docs using "split" feature.

2017-12-02 Thread Shawn Heisey

On 12/2/2017 12:55 PM, David Lee wrote:

{
  "responseHeader":{
    "status":0,
    "QTime":798}}

Though the status indicates there was no error, when I try to query on
the data using *:*, I get this:

curl 'http://localhost:8983/solr/my_collection/select?q=*:*'
{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":6,
    "params":{
      "q":"*:*"}},
  "response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
  }}

So it looks like no documents were actually indexed from above. I'm
trying to determine if this is due to an error in the reference manual,
or if I haven't set up Solr correctly.

I don't know anything at all about the split feature or the parent/child
document feature. I'm going to concentrate on the fact that numFound is
zero. With the indexing returning a success response, there should have
been SOMETHING indexed.

Did you ever do a commit operation? This can be an explicit operation,
or there are some ways you can have it happen automatically. If you
include a commitWithin parameter on the indexing request, then there
will be an automatic commit within that many milliseconds from when
indexing started. You can configure autoSoftCommit in solrconfig.xml,
then reload the core/collection or restart Solr.
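
As a rough sketch of the two request-level options (explicit commit and
commitWithin), assuming the my_collection collection on localhost:8983
from the original post. The DRY_RUN switch, the run helper, the 10000 ms
value, and the sample document are illustrative assumptions, not
anything from the thread:

```shell
#!/bin/sh
# Collection URL taken from the original post.
SOLR_URL='http://localhost:8983/solr/my_collection'

# Explicit commit: makes previously indexed documents searchable.
COMMIT_URL="${SOLR_URL}/update?commit=true"

# commitWithin is given in milliseconds; 10000 is an arbitrary example.
COMMIT_WITHIN_URL="${SOLR_URL}/update/json/docs?commitWithin=10000"

# Hypothetical helper: print the command unless DRY_RUN=0 is set,
# in which case the command is actually executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Option 1: send an explicit commit after indexing.
run curl "${COMMIT_URL}"

# Option 2: index a (made-up) document and let Solr commit on its own.
run curl "${COMMIT_WITHIN_URL}" \
  -H 'Content-type:application/json' \
  -d '{"id":"example-1"}'
```

By default the script only prints the curl commands; running it with
DRY_RUN=0 sends them to Solr.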

Unless there is a commit that opens a new searcher, changes made to the
index will never be visible to clients.
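
One way to get such commits without sending them yourself is
autoSoftCommit. Besides editing solrconfig.xml by hand, the Config API
can set the interval. A minimal sketch, assuming the same my_collection
collection; the 5000 ms interval, the DRY_RUN switch, and the run helper
are assumptions made for illustration:

```shell
#!/bin/sh
# Collection URL taken from the original post.
SOLR_URL='http://localhost:8983/solr/my_collection'

# Soft commit interval in milliseconds; 5000 is an arbitrary example.
SOFT_COMMIT_MS=5000

# Config API payload that sets the autoSoftCommit maxTime property.
CONFIG_BODY="{\"set-property\":{\"updateHandler.autoSoftCommit.maxTime\":${SOFT_COMMIT_MS}}}"

# Hypothetical helper: print the command unless DRY_RUN=0 is set,
# in which case the command is actually executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

run curl "${SOLR_URL}/config" \
  -H 'Content-type:application/json' \
  -d "${CONFIG_BODY}"
```

With a soft commit interval in place, newly indexed documents should
become visible to queries shortly after that interval elapses.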

https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

The article title says "SolrCloud" but all the information is just as
applicable to standalone mode.

If you *have* done a commit with openSearcher set to true (which is the
default setting for openSearcher), then we'll need to examine solr.log,
and you'll need to be sure that the indexing request happened during the
time the log was created.

Thanks,
Shawn

Re: Having trouble indexing nested docs using "split" feature.

2017-12-02 Thread David Lee


Sorry about the formatting for the first part, hope this is clearer:

{
    "book_id": "1234",
    "book_title": "The Martian Chronicles",
    "author": "Ray Bradbury",
    "reviews": [
        {
            "reviewer": "John Smith",
            "reviewer_background": {
                "highest_rank": "Excellent",
                "latest_review": "10/15/2017 10:15:00.000 CST"
            }
        }, {
            "reviewer": "Adam Smith",
            "reviewer_background": {
                "highest_rank": "Good",
                "latest_review": "10/10/2017 16:18:00.000 CST"
            }
        }
    ],
    "checkouts": [
        {
            "member_id": "aaabbbccc",
            "member_name": "Sam Jackson"
        }, {
            "member_id": "bbbcccddd",
            "member_name": "Buddy Jones"
        }
    ]
}



Having trouble indexing nested docs using "split" feature.

2017-12-02 Thread David Lee


Hi all,

I've been trying for some time now to find a suitable way to deal with 
JSON documents that have nested data. By suitable, I mean being able to 
index them and retrieve them so that they are in the same structure as 
when indexed.


I'm using version 7.1 under Linux Mint 18.3 with Oracle Java 1.8.0_151. 
After untarring the distribution, I ran through the "getting started" 
tutorial from the reference manual where it had me create the 
techproducts index. I then created another collection called 
my_collection so I could run the examples more easily. It used the 
_default schema.


Here is a sample:

{

    "book_id": "1234",     "book_title": "The Martian Chronicles",     
"author": "Ray Bradbury", "reviews": [     {     "reviewer": 
"John Smith",     "reviewer_background": {     
"highest_rank": "Excellent",     "latest_review": 
"10/15/2017 10:15:00.000 CST",     }     }, {     
"reviewer": "Adam Smith",    "reviewer_background": { 
    "highest_rank": "Good",     "latest_review": 
"10/10/2017 16:18:00.000 CST",     }     } ], "checkouts": [ { 
"member_id": "aaabbbccc", "member_name": "Sam Jackson" },{ "member_id": 
"bbbcccddd",   "member_name": "Buddy Jones"   }   ] }


Obviously, I'll need to search at the parent level and child level. I 
started experimenting and tried to use one of the examples from 
"Transforming and Indexing Solr JSON". However, when I tried the first 
example as follows:


curl 'http://localhost:8983/solr/my_collection/update/json/docs'\
'?split=/exams'\
'&f=first:/first'\
'&f=last:/last'\
'&f=grade:/grade'\
'&f=subject:/exams/subject'\
'&f=test:/exams/test'\
'&f=marks:/exams/marks'\
  -H 'Content-type:application/json' -d '
{
   "first": "John",
   "last": "Doe",
   "grade": 8,
   "exams": [
 {
   "subject": "Maths",
   "test"   : "term1",
   "marks"  : 90},
 {
   "subject": "Biology",
   "test"   : "term1",
   "marks"  : 86}
   ]
}'

{
  "responseHeader":{
    "status":0,
    "QTime":798}}

Though the status indicates there was no error, when I try to query on 
the data using *:*, I get this:


curl 'http://localhost:8983/solr/my_collection/select?q=*:*'
{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":6,
    "params":{
      "q":"*:*"}},
  "response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
  }}

So it looks like no documents were actually indexed from above. I'm 
trying to determine if this is due to an error in the reference manual, 
or if I haven't set up Solr correctly.


I've tried other techniques (not using the split option) like from 
Yonik's site, but those are slightly dated and I was hoping there was a 
more practical approach with the release of Solr 7.


Any assistance would be appreciated.

Thank you.
