Sorry about the formatting for the first part, hope this is clearer:
{
  "book_id": "1234",
  "book_title": "The Martian Chronicles",
  "author": "Ray Bradbury",
  "reviews": [
    {
      "reviewer": "John Smith",
      "reviewer_background": {
        "highest_rank": "Excellent",
        "latest_review": "10/15/2017 10:15:00.000 CST"
      }
    }, {
      "reviewer": "Adam Smith",
      "reviewer_background": {
        "highest_rank": "Good",
        "latest_review": "10/10/2017 16:18:00.000 CST"
      }
    }
  ],
  "checkouts": [
    {
      "member_id": "aaabbbccc",
      "member_name": "Sam Jackson"
    }, {
      "member_id": "bbbcccddd",
      "member_name": "Buddy Jones"
    }
  ]
}
On 12/2/2017 1:55 PM, David Lee wrote:
Hi all,
I've been trying for some time now to find a suitable way to deal with
JSON documents that have nested data. By suitable, I mean being able
to index them and retrieve them in the same structure they had when
indexed.
I'm using Solr 7.1 under Linux Mint 18.3 with Oracle Java
1.8.0_151. After untarring the distribution, I ran through the
"getting started" tutorial from the reference manual where it had me
create the techproducts index. I then created another collection
called my_collection so I could run the examples more easily. It used
the _default schema.
Here is a sample:
{
"book_id": "1234", "book_title": "The Martian Chronicles",
"author": "Ray Bradbury", "reviews": [ { "reviewer": "John
Smith", "reviewer_background": {
"highest_rank": "Excellent", "latest_review": "10/15/2017 10:15:00.000
CST", } }, { "reviewer": "Adam Smith",
"reviewer_background": { "highest_rank": "Good",
"latest_review": "10/10/2017 16:18:00.000 CST", }
} ], "checkouts": [ { "member_id": "aaabbbccc", "member_name":
"Sam Jackson" },{ "member_id": "bbbcccddd", "member_name":
"Buddy Jones" } ] }
Obviously, I'll need to search at the parent level and child level. I
started experimenting and tried to use one of the examples from
"Transforming and Indexing Custom JSON". However, when I tried the first
example as follows:
curl 'http://localhost:8983/solr/my_collection/update/json/docs'\
'?split=/exams'\
'&f=first:/first'\
'&f=last:/last'\
'&f=grade:/grade'\
'&f=subject:/exams/subject'\
'&f=test:/exams/test'\
'&f=marks:/exams/marks'\
-H 'Content-type:application/json' -d '
{
  "first": "John",
  "last": "Doe",
  "grade": 8,
  "exams": [
    {
      "subject": "Maths",
      "test"   : "term1",
      "marks"  : 90},
    {
      "subject": "Biology",
      "test"   : "term1",
      "marks"  : 86}
  ]
}'
{
"responseHeader":{
"status":0,
"QTime":798}}
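For reference, my understanding of the reference manual is that the request above should produce one flat document per entry in "exams", with the parent fields copied into each. Here is a minimal Python sketch of that expectation. It is only an illustration of what I expect, not how Solr implements it: the function name split_and_map is hypothetical, and it handles only a single-level split path such as /exams.

```python
from typing import Any, Dict, List


def split_and_map(
    doc: Dict[str, Any], split_path: str, field_paths: Dict[str, str]
) -> List[Dict[str, Any]]:
    """Emulate split=/x with f=name:/path mappings for a one-level split.

    Hypothetical helper: returns one flat dict per element found under
    split_path, copying parent-level fields into every result.
    """
    split_key = split_path.strip("/")
    flat_docs = []
    for child in doc.get(split_key, []):
        flat: Dict[str, Any] = {}
        for field, path in field_paths.items():
            parts = path.strip("/").split("/")
            if len(parts) == 2 and parts[0] == split_key:
                value = child.get(parts[1])   # field lives in the child
            elif len(parts) == 1:
                value = doc.get(parts[0])     # field lives in the parent
            else:
                value = None                  # deeper paths not handled here
            if value is not None:
                flat[field] = value
        flat_docs.append(flat)
    return flat_docs


sample = {
    "first": "John",
    "last": "Doe",
    "grade": 8,
    "exams": [
        {"subject": "Maths", "test": "term1", "marks": 90},
        {"subject": "Biology", "test": "term1", "marks": 86},
    ],
}

mappings = {
    "first": "/first",
    "last": "/last",
    "grade": "/grade",
    "subject": "/exams/subject",
    "test": "/exams/test",
    "marks": "/exams/marks",
}

docs = split_and_map(sample, "/exams", mappings)
for d in docs:
    print(d)
```

So I would expect a *:* query to find two documents, one for Maths and one for Biology.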
Though the status indicates there was no error, when I try to query
the data using *:*, I get this:
curl 'http://localhost:8983/solr/my_collection/select?q=*:*'
{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":6,
"params":{
"q":"*:*"}},
"response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
}}
So it looks like no documents were actually indexed from above. I'm
trying to determine if this is due to an error in the reference
manual, or if I haven't set up Solr correctly.
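One setup detail that may matter here, though I have not confirmed it is the cause: Solr only makes newly added documents visible to searches after a commit, and the update request above does not ask for one. Below is a minimal Python sketch that builds the same update URL with commit=true appended. The helper name build_update_url is hypothetical, and the host and collection are simply the ones from my setup.

```python
from urllib.parse import urlencode


def build_update_url(base_url, split, fields, commit=True):
    """Build an /update/json/docs URL (hypothetical helper).

    fields is a list of (name, path) pairs, each sent as f=name:path.
    """
    params = [("split", split)]
    params += [("f", f"{name}:{path}") for name, path in fields]
    if commit:
        # Ask Solr to commit as part of this update request.
        params.append(("commit", "true"))
    # Keep "/" and ":" unescaped so the URL stays readable.
    return f"{base_url}/update/json/docs?" + urlencode(params, safe="/:")


url = build_update_url(
    "http://localhost:8983/solr/my_collection",
    "/exams",
    [
        ("first", "/first"),
        ("last", "/last"),
        ("grade", "/grade"),
        ("subject", "/exams/subject"),
        ("test", "/exams/test"),
        ("marks", "/exams/marks"),
    ],
)
print(url)
```

If the documents show up after posting to a URL like this, the missing commit would explain the empty result rather than an error in the manual.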
I've tried other techniques (not using the split option), such as those
on Yonik's site, but those are slightly dated and I was hoping there was
a more practical approach with the release of Solr 7.
Any assistance would be appreciated.
Thank you.