Re: Faceting on multivalued field
Are you suggesting that I change the DB query of the nested entity that fetches the comments (the query is in my post), or can something be done at index time, e.g. with Transformers?

Thanks,
Kaushik

On Mon, Apr 4, 2011 at 8:07 AM, Erick Erickson wrote:
> Why not count them on the way in and just store that number along
> with the original e-mail?
>
> Best
> Erick
>
> On Sun, Apr 3, 2011 at 10:10 PM, Kaushik Chakraborty wrote:
> > Ok. My expectation was since "comment_post_id" is a MultiValued field hence
> > it would appear multiple times (i.e. for each comment). And hence when I
> > would facet with that field it would also give me the count of those many
> > documents where comment_post_id appears.
> >
> > My requirement is getting total for every document i.e. finding number of
> > comments per post in the whole corpus. To explain it more clearly, I'm
> > getting a result xml something like this
> >
> > 46
> > Hello World
> > 20
> >
> > 9
> > 10
> >
> > 19
> > 2
> >
> > 46
> > 46
> >
> > Hello - from World
> > Hi
> >
> > *1*
> >
> > I need the count to be 2 as the post 46 has 2 comments.
> >
> > What other way can I approach?
> >
> > Thanks,
> > Kaushik
> >
> > On Mon, Apr 4, 2011 at 4:29 AM, Erick Erickson wrote:
> > > Hmmm, I think you're misunderstanding faceting. It's counting the
> > > number of documents that have a particular value. So if you're
> > > faceting on "comment_post_id", there is one and only one document
> > > with that value (assuming that the comment_post_ids are unique).
> > > Which is what's being reported. This will be quite expensive on a
> > > large corpus, BTW.
> > >
> > > Is your task to show the totals for *every* document in your corpus or
> > > just the ones in a display page? Because if the latter, your app could
> > > just count up the number of elements in the XML returned for the
> > > multiValued comments field.
> > >
> > > If that's not relevant, could you explain a bit more why you need this
> > > count?
> > >
> > > Best
> > > Erick
> > >
> > > On Sun, Apr 3, 2011 at 2:31 PM, Kaushik Chakraborty <kaych...@gmail.com> wrote:
> > > > Hi,
> > > >
> > > > My index contains a root entity "Post" and a child entity "Comments". Each
> > > > post can have multiple comments. data-config.xml:
> > > >
> > > > dataSource="jdbc" query="">
> > > >
> > > > query="select * from comments where post_id = ${posts.post_id}"
> > > >
> > > > The schema has all columns of "comment" entity as "MultiValued" fields and
> > > > all fields are indexed & stored. My requirement is to count the number of
> > > > comments for each post. Approach I'm taking is to query on "*:*" and
> > > > faceting the result on "comment_post_id" so that it gives the count of
> > > > comment occurred for that post.
> > > >
> > > > But I'm getting incorrect result e.g. if a post has 2 comments, the
> > > > multivalued fields are populated alright but the facet count is coming as 1
> > > > (for that post_id). What else do I need to do?
> > > >
> > > > Thanks,
> > > > Kaushik
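One way to read the "count them on the way in" suggestion is to compute the number of comments while each post document is being assembled and store it as an ordinary field. The sketch below is only an illustration in plain Python, not DIH configuration: the rows are made-up sample data, and the field names `comment_id` and `comment_count` are hypothetical (only `post_id` and `comment_post_id` appear in the thread). The same number could presumably be produced by the SQL query or by a DIH transformer instead.

```python
from collections import defaultdict


def build_post_docs(posts, comments):
    """Build one document per post, with its comments and a precomputed count."""
    # Group comment rows by the post they belong to.
    by_post = defaultdict(list)
    for comment in comments:
        by_post[comment["post_id"]].append(comment)

    docs = []
    for post in posts:
        own = by_post.get(post["post_id"], [])
        docs.append({
            "post_id": post["post_id"],
            # Multivalued fields: one entry per comment.
            "comment_id": [c["comment_id"] for c in own],      # hypothetical field
            "comment_post_id": [c["post_id"] for c in own],
            # Stored once per post, so no faceting is needed to read it back.
            "comment_count": len(own),                          # hypothetical field
        })
    return docs


# Made-up sample rows: one post with two comments.
sample_posts = [{"post_id": 46}]
sample_comments = [
    {"comment_id": 1, "post_id": 46},
    {"comment_id": 2, "post_id": 46},
]
sample_docs = build_post_docs(sample_posts, sample_comments)
```

With the count stored per document, the number of comments for a post is simply read from that field in the search result.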
Re: Faceting on multivalued field
Ok. My expectation was that, since "comment_post_id" is a multiValued field, it would appear multiple times (i.e. once for each comment), and hence faceting on that field would give me the count of those occurrences.

My requirement is to get the total for every document, i.e. the number of comments per post across the whole corpus. To explain it more clearly, I'm getting a result XML something like this:

46
Hello World
20

9
10

19
2

46
46

Hello - from World
Hi

*1*

I need the count to be 2, as post 46 has 2 comments.

What other approach can I take?

Thanks,
Kaushik

On Mon, Apr 4, 2011 at 4:29 AM, Erick Erickson wrote:
> Hmmm, I think you're misunderstanding faceting. It's counting the
> number of documents that have a particular value. So if you're
> faceting on "comment_post_id", there is one and only one document
> with that value (assuming that the comment_post_ids are unique).
> Which is what's being reported. This will be quite expensive on a
> large corpus, BTW.
>
> Is your task to show the totals for *every* document in your corpus or
> just the ones in a display page? Because if the latter, your app could
> just count up the number of elements in the XML returned for the
> multiValued comments field.
>
> If that's not relevant, could you explain a bit more why you need this
> count?
>
> Best
> Erick
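The difference between what faceting reports and what is wanted here can be shown with a small model. This is a simplified sketch in plain Python, not Solr's implementation: `facet_counts` counts how many documents contain each value (so repeated values inside one document count once), while `values_per_doc` counts the entries of a multivalued field per document, which is the client-side counting suggested in the reply. The sample document is made up to mirror the post 46 case.

```python
from collections import Counter


def as_list(value):
    """Treat a single value as a one-element list."""
    return value if isinstance(value, list) else [value]


def facet_counts(docs, field):
    """Simplified facet model: number of documents containing each value."""
    counts = Counter()
    for doc in docs:
        if field in doc:
            # set() collapses repeats, so a document counts once per value.
            counts.update(set(as_list(doc[field])))
    return counts


def values_per_doc(docs, key, field):
    """Number of entries in a multivalued field, keyed by document id."""
    return {doc[key]: len(as_list(doc[field])) if field in doc else 0
            for doc in docs}


# Made-up document: post 46 with two comments.
docs = [{"post_id": 46, "comment_post_id": [46, 46]}]
facets = facet_counts(docs, "comment_post_id")
per_doc = values_per_doc(docs, "post_id", "comment_post_id")
```

In this model the facet count for 46 is 1 because only one document carries that value, whereas the per-document length is 2.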
Faceting on multivalued field
Hi,

My index contains a root entity "Post" and a child entity "Comments". Each post can have multiple comments.

data-config.xml:

dataSource="jdbc" query="">

query="select * from comments where post_id = ${posts.post_id}"

The schema has all columns of the "comment" entity as multiValued fields, and all fields are indexed and stored. My requirement is to count the number of comments for each post. The approach I'm taking is to query on "*:*" and facet the result on "comment_post_id", so that it gives the count of comments for each post.

But I'm getting an incorrect result: e.g. if a post has 2 comments, the multivalued fields are populated correctly, but the facet count comes back as 1 (for that post_id). What else do I need to do?

Thanks,
Kaushik
Re: SOLR DIH importing MySQL "text" column as a BLOB
The query is there in the data-config.xml, and it fetches the expected data from the database.

Thanks,
Kaushik

On Wed, Mar 16, 2011 at 9:21 PM, Gora Mohanty wrote:
> On Wed, Mar 16, 2011 at 2:29 PM, Stefan Matheis wrote:
> > Kaushik,
> >
> > i just remembered an ML-Post few weeks ago .. same problem while
> > importing geo-data
> > (http://lucene.472066.n3.nabble.com/Solr-4-0-Spatial-Search-How-to-tp2245592p2254395.html)
> > - the solution was:
> >
> >> CAST( CONCAT( lat, ',', lng ) AS CHAR )
> >
> > at that time i search a little bit for the reason and afaik there was
> > a bug in mysql/jdbc which produces that binary output under certain
> > conditions
> [...]
>
> As Stefan mentions, there might be a way to solve this.
>
> Could you show us the query in DIH that you are using
> when you get this BLOB, i.e., the SELECT statement
> that goes to the database?
>
> It might also be instructive for you to try that same
> SELECT directly in a mysql interface.
>
> Regards,
> Gora
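The workaround quoted above wraps the affected expression in `CAST( ... AS CHAR )`. As a rough sketch of how that could be applied to a `text` column, the helper below builds a SELECT statement in which chosen columns are cast to CHAR. The helper, the table name `posts` and the column names are all hypothetical; the actual DIH query is not shown in the thread, and the helper does no identifier quoting, so it is only suitable for trusted names.

```python
def cast_text_columns(table, columns, text_columns):
    """Build a SELECT that wraps the given text columns in CAST(... AS CHAR)."""
    select_list = []
    for col in columns:
        if col in text_columns:
            # Keep the original column name as the alias so field mappings still match.
            select_list.append(f"CAST({col} AS CHAR) AS {col}")
        else:
            select_list.append(col)
    return f"SELECT {', '.join(select_list)} FROM {table}"


# Hypothetical table and column names, for illustration only.
query = cast_text_columns("posts", ["id", "message"], {"message"})
```

The resulting string could then be tried directly in a mysql interface, as suggested in the reply, before placing it in the entity's query.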
SOLR DIH importing MySQL "text" column as a BLOB
I have a column for posts in MySQL of type `text`. I've tried the corresponding `field-type` values for it in the Solr `schema.xml`, e.g. `string`, `text`, `text-ws`, but whenever I import it using the DIH, it gets imported as a BLOB object. I checked: this happens only for columns of type `text` and not for `varchar` (those get indexed as strings). Hence, the posts field does not become searchable.

I found out about this issue, after repeated search failures, when I ran a `*:*` query on Solr. A sample response:

1.0 [B@10a33ce2 2011-02-21T07:02:55Z test.acco...@gmail.com Test Account [B@2c93c4f1 1

The `data-config.xml` :

The `schema.xml` :

solr_post_status_message_id solr_post_message

Thanks,
Kaushik
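Values such as `[B@10a33ce2` in the sample response are the default Java string form of a `byte[]` object (`[B`, then `@`, then a hexadecimal hash code), which is consistent with the column arriving as binary data rather than as a string. A minimal sketch for spotting which fields of a returned document are affected follows. The document dict is made up, and attaching the value to `solr_post_message` is an assumption, since the sample response above no longer shows the field names.

```python
import re

# Default Java toString() form of a byte[]: "[B@" followed by a hex hash code.
BYTE_ARRAY_REPR = re.compile(r"^\[B@[0-9a-f]+$")


def fields_with_byte_array_repr(doc):
    """Return the names of fields whose value looks like a stringified byte[]."""
    hits = []
    for name, value in doc.items():
        values = value if isinstance(value, list) else [value]
        if any(isinstance(v, str) and BYTE_ARRAY_REPR.match(v) for v in values):
            hits.append(name)
    return hits


# Made-up document; which field held the value is an assumption.
sample_doc = {
    "solr_post_message": "[B@10a33ce2",
    "name": "Test Account",
    "date": "2011-02-21T07:02:55Z",
}
affected = fields_with_byte_array_repr(sample_doc)
```

Running a check like this over a `*:*` result gives a quick list of the fields that were indexed from binary values.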