Re: How To apply transformation in DIH for multivalued numeric field?

2012-07-19 Thread jmlucjav
I have seen this issue several times. In my case it was always with an id
field, a MySQL database, and Linux. The same config on Windows did not show
the issue.

I never got to the bottom of it. Because the field was an id, it still
worked, since the value remained unique.



How To apply transformation in DIH for multivalued numeric field?

2012-07-18 Thread Pranav Prakash
I have a multivalued integer field and a multivalued string field defined
in my schema as

<field name="community_tag_ids"
       type="integer"
       indexed="true"
       stored="true"
       multiValued="true"
       omitNorms="true" />
<field name="community_tags"
       type="text"
       indexed="true"
       termVectors="true"
       stored="true"
       multiValued="true"
       omitNorms="true" />


The DIH entity and field definitions for these are:

<entity name="document"
        dataSource="app"
        onError="skip"
        transformer="RegexTransformer"
        query="...">

  <entity name="community_tags"
          transformer="RegexTransformer"
          query="SELECT
                   group_concat(a.id SEPARATOR ',') AS community_tag_ids,
                   group_concat(a.title SEPARATOR ',') AS community_tags
                 FROM tags a JOIN tag_dets b ON a.id = b.tag_id
                 WHERE b.doc_id = ${document.id}">
    <field column="community_tag_ids" name="community_tag_ids"/>
    <field column="community_tags" splitBy="," />
  </entity>

</entity>

The value of the field community_tags comes through correctly as an array of
strings. However, the value of the field community_tag_ids is wrong:

<arr name="community_tag_ids">
  <int>[B@390c0a18</int>
</arr>

I tried chaining NumberFormatTransformer with formatStyle="number", but that
throws DataImportHandlerException: Failed to apply NumberFormat on column.
Could it be due to NULL values from the database, or because the value is
malformed? How do we handle NULL in this case?


*Pranav Prakash*

temet nosce
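
A note on the output above: `[B@390c0a18` is what Java prints when toString() is called on a byte[], which suggests the JDBC driver handed DIH the GROUP_CONCAT result for the numeric a.id column as binary data rather than as a string. One workaround people commonly try is to cast the ids to CHAR inside the query and add splitBy to the id field as well. The fragment below is only a sketch under that assumption. It reuses the table and column names from the question, has not been verified against this setup, and may behave differently depending on the MySQL and driver versions.

```xml
<!-- Sketch only: assumes the driver returns GROUP_CONCAT over the numeric
     a.id column as binary data. Casting each id to CHAR first is meant to
     keep the concatenated result a character string. -->
<entity name="community_tags"
        transformer="RegexTransformer"
        query="SELECT
                 group_concat(CAST(a.id AS CHAR) SEPARATOR ',') AS community_tag_ids,
                 group_concat(a.title SEPARATOR ',') AS community_tags
               FROM tags a JOIN tag_dets b ON a.id = b.tag_id
               WHERE b.doc_id = ${document.id}">
  <!-- splitBy turns each comma-separated string into multiple values -->
  <field column="community_tag_ids" splitBy="," />
  <field column="community_tags" splitBy="," />
</entity>
```

If the cast has no effect, dropping GROUP_CONCAT altogether avoids the binary result entirely, at the cost of more rows being fetched.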


RE: How To apply transformation in DIH for multivalued numeric field?

2012-07-18 Thread Dyer, James
Don't you want to specify splitBy for the integer field too?

Actually though, you shouldn't need to use GROUP_CONCAT and RegexTransformer at
all.  DIH is designed to handle one-to-many relations between parent and child
entities by populating all the child fields as multi-valued automatically.  I
guess your approach leads to far fewer rows getting sent from your db to Solr,
though.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311
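
For readers following along, the suggestion above can be sketched roughly as the data-config fragment below. It is an illustration rather than the configuration anyone in the thread posted: the parent query stays elided as in the question, and the child query is an assumed rewrite of the original that returns one row per tag instead of one concatenated row.

```xml
<entity name="document"
        dataSource="app"
        onError="skip"
        query="...">
  <!-- Child entity: returns one row per tag. DIH collects the repeated
       column values into the multiValued fields of the parent document,
       so no GROUP_CONCAT or RegexTransformer is involved. -->
  <entity name="community_tags"
          query="SELECT a.id AS community_tag_ids, a.title AS community_tags
                 FROM tags a JOIN tag_dets b ON a.id = b.tag_id
                 WHERE b.doc_id = ${document.id}">
    <field column="community_tag_ids" name="community_tag_ids" />
    <field column="community_tags" name="community_tags" />
  </entity>
</entity>
```

The trade-off is the one already mentioned: the database sends one row per tag for each document rather than a single concatenated row.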



Re: How To apply transformation in DIH for multivalued numeric field?

2012-07-18 Thread Pranav Prakash
I had tried splitBy on the numeric field, but that did not work for me
either. However, once I got rid of group_concat, everything worked.

Thanks a lot!! I really had a difficult time understanding this behavior.


*Pranav Prakash*

temet nosce


