Storing 2 dimension array in Solr

2013-10-12 Thread David Philip
Hi,

  I have a 2 dimension array and want it to be persisted in solr. How can I
do that?

Sample case:

          disease1   disease2    disease3
group1    exist      slight      not found
groups2   slight     not found   exist
group2    slight     exist

The values could also be stored as codes: exist = 1, not found = 2, slight = 3.

Note: This array has frequent updates. Every time a new disease gets added,
I have to add a description of that disease to all groups. At query time
I will fetch by row, i.e. get by group: only the row where group = group2.

Any suggestions on how I can achieve this? I am thankful to the forum for
replying with patience. Once I achieve this, I will blog about it and share it
with all.

Thanks - David


Re: Solr's Filtering approaches

2013-10-12 Thread Roman Chyla
David,
We have a similar query in astrophysics: a user can select an area of the
sky (many stars out there).

I am long overdue in creating a Jira issue, but here is another
efficient mechanism for searching a large number of ids:

https://github.com/romanchyla/montysolr/blob/master/contrib/adsabs/src/java/org/apache/solr/search/BitSetQParserPlugin.java

Roman
On 12 Oct 2013 01:57, David Philip davidphilipshe...@gmail.com wrote:

 Groups are pharmaceutical research experiments. The user is presented with a
 graph view; he can select a region, and all the groups in that region get
 included. The user can also modify the groups here, so we didn't maintain
 group information in the same Solr index but externalized it.
 I looked at the post filter article. My understanding is that I simply have
 to extend it as you did and include an implementation of
 isAllowed(acls[doc], groups). This will filter the documents in the
 collector, and finally this collector will be returned. Am I right?

   @Override
   public void collect(int doc) throws IOException {
     if (isAllowed(acls[doc], user, groups)) super.collect(doc);
   }


 Erick, I am interested to know whether I can extend any class that
 returns only the bitset of the documents that match the search query. I
 can then do bitset1.and(bitset2OfGroups) and finally collect only those
 documents to return to the user. How do I try this approach? Any pointers on
 bitsets?

 Thanks - David
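
The bitset intersection described above can be illustrated with the JDK's
java.util.BitSet. This is only a minimal sketch: it does not use Solr's or
Lucene's own doc-set classes, and the class name, method name and document ids
are hypothetical. Each set bit stands for an internal document id.

```java
import java.util.BitSet;

final class BitSetIntersection {

    /** Returns the documents present in both sets, leaving the inputs untouched. */
    static BitSet intersect(BitSet queryMatches, BitSet allowedByGroups) {
        BitSet result = (BitSet) queryMatches.clone(); // and() mutates, so work on a copy
        result.and(allowedByGroups);
        return result;
    }

    public static void main(String[] args) {
        BitSet queryMatches = new BitSet();   // docs matching the search query (made up)
        queryMatches.set(1);
        queryMatches.set(4);
        queryMatches.set(7);

        BitSet allowedByGroups = new BitSet(); // docs in the selected groups (made up)
        allowedByGroups.set(4);
        allowedByGroups.set(5);
        allowedByGroups.set(7);

        System.out.println(intersect(queryMatches, allowedByGroups)); // prints {4, 7}
    }
}
```

Whether such a bitset can be obtained from, or handed back to, Solr at the
point David needs it is a separate question that this sketch does not answer.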




 On Thu, Oct 10, 2013 at 5:25 PM, Erick Erickson erickerick...@gmail.com
 wrote:

  Well, my first question is why 50K groups is necessary, and
  whether you can simplify that. How a user can manually
  choose from among that many groups is interesting. But
  assuming they're all necessary, I can think of two things.
 
  If the user can only select ranges, just put in filter queries
  using ranges. Or possibly both ranges and individual entries,
  as fq=group:[1A TO 1A] OR group:(2B 45C 98Z) etc.
  You need to be a little careful how you index these so
  range queries work properly. In the above you'd miss
  2A because it sorts lexicographically, so you'd need to
  store the values in some form that sorts correctly, like 001A 01A
  and so on. You wouldn't need to show that form to the
  user, just form your fq's in the app to work with
  that form.
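
The sorting problem described above can be reproduced outside Solr with plain
string comparison. The sketch below is an illustration only: the id pattern
(digits followed by letters), the padding width of 3 and the class and method
names are assumptions, not something stated in the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupIdPadding {

    // Assumed id shape: a numeric part followed by optional letters, e.g. "45C".
    private static final Pattern ID = Pattern.compile("(\\d+)([A-Za-z]*)");

    /** Left-pads the numeric part with zeros, e.g. pad("2A", 3) gives "002A". */
    static String pad(String groupId, int width) {
        Matcher m = ID.matcher(groupId);
        if (!m.matches()) {
            throw new IllegalArgumentException("unexpected group id: " + groupId);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = m.group(1).length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(m.group(1)).append(m.group(2)).toString();
    }

    /** Inclusive lexicographic range check, the way a string range compares terms. */
    static boolean inRange(String value, String low, String high) {
        return value.compareTo(low) >= 0 && value.compareTo(high) <= 0;
    }

    public static void main(String[] args) {
        // Unpadded: "2A" sorts after "10A", so it falls outside [1A TO 10A].
        System.out.println(inRange("2A", "1A", "10A")); // prints false
        // Padded: "002A" lies between "001A" and "010A".
        System.out.println(inRange(pad("2A", 3), pad("1A", 3), pad("10A", 3))); // prints true
    }
}
```

The application would apply the same padding both when indexing and when
building the fq, and strip it again before showing ids to the user.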
 
  If that won't work (you wouldn't want this to get huge), think
  about a post filter that would only operate on documents that
  had made it through the select, although how to convey which
  groups the user selected to the post filter is an open
  question.
 
  Best,
  Erick
 
  On Wed, Oct 9, 2013 at 12:23 PM, David Philip
  davidphilipshe...@gmail.com wrote:
   Hi All,

   I have an issue in handling filters for one of our requirements and
   would like suggestions on the best approaches.


   *Use Case:*

   1.  We have a list of groups, and the number of groups can increase up to
   1 million. Currently we have almost 90 thousand groups in the Solr search
   system.

   2.  Just before the user runs a search, he has the option to select the
   no. of groups he wants to retrieve. [The distinct list of these group names
   for display is retrieved from another Solr index that has more information
   about groups.]

   3.  *User Operation:*
   Say the user selected group 1A  - group 1A and searches for key:cancer.


   The current approach I was thinking of is: get the search results and filter
   the query by the list of group ids selected by the user. But my concern is
   that when this group list grows to 50k unique ids, it can cause a lot of
   delay in getting search results. So I wanted to know whether there are
   different filtering approaches that I can try.

   I was thinking of one more approach, suggested by my colleague: do an
   intersection.
   - Get the group ids selected by the user.
   - Get the list of group ids from the search results.
   - Perform an intersection of both and then get the entire result set of only
     those group ids that intersected.
   Is this a better way? Can I use any caching technique in this case?


   - David.
 



Re: Replace NULL with 0 while Indexing

2013-10-12 Thread Arcadius Ahouansou
What about using COALESCE in SQL?

like:
select COALESCE(duration, 0) as duration from mytable



On 11 October 2013 22:02, keshari.prerna keshari.pre...@gmail.com wrote:

 Hello,

 One of my indexed fields has NULL values and I want them to be replaced with
 0 at indexing time, so that when I search after indexing I get 0
 instead of NULL.

 This is my data-config.xml; duration is the field which has null values.

 <dataConfig>
   <dataSource type="JdbcDataSource"
               driver="com.mysql.jdbc.Driver"
               url="jdbc:mysql://trdbadhoc/test_results"
               responseBuffering="adaptive"
               batchSize="-1"
               user="results"
               password="resultsloader"/>
   <document>
     <entity name="Test_Syndrome"
             pk="id"
             query="SELECT TS.id AS id, TET.type AS error_type, TS.syndrome AS syndrome,
                    S.start_date, SE.session_id AS sessionid,
                    S.duration, TL.logfile, J.job_number AS job, cluster,
                    S.hostname, platform FROM Test_Syndrome AS TS
                    STRAIGHT_JOIN Session_Errors AS SE ON (SE.test_syndrome_id = TS.id)
                    STRAIGHT_JOIN Session AS S ON (S.id = SE.session_id)
                    STRAIGHT_JOIN Test_Run AS TR ON (TR.session_id = SE.session_id)
                    STRAIGHT_JOIN Test_Log AS TL ON (TL.id = TR.test_log_id)
                    STRAIGHT_JOIN Job AS J ON (J.id = TL.job_id)
                    STRAIGHT_JOIN Cluster AS C ON (C.id = J.cluster_id)
                    STRAIGHT_JOIN Platform ON (TR.platform_id = Platform.id)
                    STRAIGHT_JOIN Test_Error_Type TET ON (SE.test_error_type_id = TET.id)">

       <Field column="id" name="id"/>
       <Field column="error_type" name="error_type"/>
       <Field column="syndrome" name="syndrome"/>
       <Field column="sessionid" name="sessionid"/>
       <Field column="duration" name="duration"/>
       <Field column="logfile" name="logfile"/>
       <Field column="job" name="job"/>
       <Field column="cluster" name="cluster"/>
       <Field column="hostname" name="hostname"/>
       <Field column="platform" name="platform"/>

     </entity>
   </document>
 </dataConfig>

 Please help.

 Thanks & Regards,
 Prerna





 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Replace-NULL-with-0-while-Indexing-tp4095059.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Replace NULL with 0 while Indexing

2013-10-12 Thread Karol Sikora

Or you can use a custom update preprocessor.
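
The null-to-zero rule itself is small. The sketch below shows only that rule
on a plain Map standing in for a document; it is not a Solr plugin. In Solr
the logic would live in an UpdateRequestProcessor subclass, which is not shown
here, and the class and method names below are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class NullToZero {

    /** Puts 0 into the field when it is missing or explicitly null. */
    static void defaultToZero(Map<String, Object> doc, String field) {
        if (doc.get(field) == null) {
            doc.put(field, 0);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("id", 42);
        doc.put("duration", null); // what a NULL database column would look like here

        defaultToZero(doc, "duration");
        System.out.println(doc.get("duration")); // prints 0
    }
}
```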





--
 
Karol Sikora

IT Project Manager, CBN - Interfejs 2.0
+48 781 493 788

Laboratorium EE
ul. Mokotowska 46A/23 | 00-543 Warszawa |
www.laboratorium.ee | www.laboratorium.ee/facebook



Re: Storing 2 dimension array in Solr

2013-10-12 Thread Erick Erickson
David:

This feels like it may be an XY problem. _Why_ do you
want to store a 2-dimensional array and what
do you want to do with it? Maybe there are better
approaches.

Best
Erick





Re: Storing 2 dimension array in Solr

2013-10-12 Thread David Philip
Hi Erick,

   We have a set of groups, as represented below. New columns (diseases, in the
matrix below) keep arriving and we need to add each of them as a new column. For
each such column, every group has a value such as 1, 2, 3 or 4 (exist, slight,
na, notfound).

While querying, we need to get the entire row for group:group1. We will
not be searching on the column (*_disease) values, so they are indexed=false
but stored=true.

For example: we query group:group1 and need to get the entire row:
exist, slight, not found. I hope this explanation is clearer.

          disease1   disease2    disease3
group1    exist      slight      not found
groups2   slight     not found   exist
group3    slight     exist
groupK    -          na          exist



Thanks - David





 



Re: Profiling Solr Lucene for query

2013-10-12 Thread Manuel Le Normand
Would adding a dummy shard instead of a dummy collection resolve the
situation? For example, editing clusterstate.json from a ZooKeeper client and
adding a shard with a 0-range so that no docs are routed to this core. This core
would be on a separate server and act as the collection gateway.


Re: Storing 2 dimension array in Solr

2013-10-12 Thread Erick Erickson
Isn't this just indexing each row as a separate document
with a suitable ID (groupN in your example)?





Re: SolrCloud on SSL

2013-10-12 Thread Shawn Heisey
On 10/11/2013 9:38 AM, Christopher Gross wrote:
 On Fri, Oct 11, 2013 at 11:08 AM, Shawn Heisey s...@elyograg.org wrote:
 
 On 10/11/2013 8:17 AM, Christopher Gross wrote: 
 Is there a spot in a Solr configuration that I can set this up to use
 HTTPS?

 From what I can tell, not yet.

 https://issues.apache.org/jira/browse/SOLR-3854
 https://issues.apache.org/jira/browse/SOLR-4407
 https://issues.apache.org/jira/browse/SOLR-4470


 Dang.

Christopher,

I was just looking through Solr source code for a completely different
issue, and it seems that there *IS* a way to do this in your configuration.

If you were to use "https://hostname" or "https://ipaddress" as the
host parameter in your solr.xml file on each machine, it should do
what you want.  The parameter is described here, but not the behavior
that I have discovered:

http://wiki.apache.org/solr/SolrCloud#SolrCloud_Instance_Params

Boring details: In the org.apache.solr.cloud package, there is a
ZkController class.  The getHostAddress method is where I discovered
that you can do this.

If you could try this out and confirm that it works, I will get the wiki
page updated and look into the Solr reference guide as well.

Thanks,
Shawn



Re: Storing 2 dimension array in Solr

2013-10-12 Thread David Philip
Hi Erick, Yes it is. But the columns here are added dynamically and very
frequently. They can increase up to 1 million right now. So, is 1 document
with 1 million dynamic fields fine? Or is there any other approach?

While searching the web, I found that docValues are column oriented.
http://searchhub.org/2013/04/02/fun-with-docvalues-in-solr-4-2/
However, I did not understand how to use docValues to add these columns.

What is the recommended approach?

Thanks - David









Re: Storing 2 dimension array in Solr

2013-10-12 Thread Jack Krupansky
You may be better off indexing each element of the array as a solr document, 
with a group field and a disease field. Then you can easily and efficiently 
add new diseases. Then to query a row, you query for the group field having 
the desired group.


If possible, index the array as being sparse - no document for a disease if 
it is not present for that group.
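
The layout suggested above can be sketched in memory as follows. This is only
an illustration of the data model, not Solr code: the class names are
hypothetical, and the "query" is a linear scan standing in for a search on the
group field. There is one entry per non-empty cell, so adding a new disease
only adds entries for the groups that actually have a value for it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SparseMatrixIndex {

    /** One "document" per non-empty cell of the group x disease matrix. */
    static final class Cell {
        final String group;
        final String disease;
        final String status;

        Cell(String group, String disease, String status) {
            this.group = group;
            this.disease = disease;
            this.status = status;
        }
    }

    private final List<Cell> docs = new ArrayList<>();

    /** Adds one cell; existing entries are never touched. */
    void add(String group, String disease, String status) {
        docs.add(new Cell(group, disease, status));
    }

    /** Stand-in for querying group:<name>; returns disease -> status for that row. */
    Map<String, String> row(String group) {
        Map<String, String> row = new LinkedHashMap<>();
        for (Cell c : docs) {
            if (c.group.equals(group)) {
                row.put(c.disease, c.status);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        SparseMatrixIndex index = new SparseMatrixIndex();
        index.add("group1", "disease1", "exist");
        index.add("group1", "disease2", "slight");
        index.add("group2", "disease1", "slight");
        // A new disease arrives; only group1 has a value for it.
        index.add("group1", "disease3", "exist");

        System.out.println(index.row("group1")); // prints {disease1=exist, disease2=slight, disease3=exist}
        System.out.println(index.row("group2")); // prints {disease1=slight}
    }
}
```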


-- Jack Krupansky







Re: SolrCore 'collection1' is not available due to init failure

2013-10-12 Thread Jim_Armstrong
Liu Bo,

Changing the permissions fixed the problem.  Thank you for helping me.

Best regards, Jim



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCore-collection1-is-not-available-due-to-init-failure-tp4094869p4095195.html
Sent from the Solr - User mailing list archive at Nabble.com.