Solr Schema and how?

2011-10-04 Thread caman
Hello all,

We have a screen builder application where users design their own forms.
They can create form fields of type date, text, number, large text, etc., with
up to 500 fields supported per screen.
Once screens are designed, the system automatically handles type checking for
valid data entries on the front end, even though data of any type gets stored
as text.
So, as you can imagine, the table is huge, with 600+ columns (screenId,
recordId, field1 ... field500), and every column is typed as 'text'. The same
table stores data for every screen designed in the system.

So basically, here are my questions:

1. How best to index it? I did it using a dynamic field 'field*', which works
great.
2. Since everything is text, I am not sure how to enable filtering on each
field. E.g., if a user wants 'greater than' or 'less than' queries on a number
field (stored as text), that data somehow needs to be stored as a number in
Solr, but I don't think I have a way to do that, since 'field2' may be a
'number' field for 'screen1' and a 'date' field for 'screen2'.
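An editorial note for the archive: the standard workaround for question 2 is a family of type-suffixed dynamic fields, with the indexing code choosing the suffix from the screen's field-type metadata, so range queries work even though the database column is text. A schema.xml sketch (suffixes and types are illustrative, not from the thread):

```xml
<!-- Each screen's metadata decides the suffix: screen1 indexes its
     "field2" as field2_i (int), screen2 indexes its "field2" as
     field2_dt (date), so numeric/date range queries work per screen. -->
<dynamicField name="field*_s"  type="string" indexed="true" stored="true"/>
<dynamicField name="field*_i"  type="tint"   indexed="true" stored="true"/>
<dynamicField name="field*_dt" type="tdate"  indexed="true" stored="true"/>
```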




I would appreciate any ideas on how to handle this.



thanks

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Schema-and-how-tp3393989p3393989.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: how can i develop client application with solr url using javascript?

2011-08-22 Thread caman
Search for 'ajax-solr' on Google. To handle the Solr URL, look at setting up a
proxy.
Good luck.



Re: Solr 4.0 => Spatial Search - How to

2011-01-14 Thread caman


The SQL that worked (casting to CHAR before concatenating, so MySQL does not
produce a BLOB):

CONCAT(CAST(lat AS CHAR), ',', CAST(lng AS CHAR))


Re: Solr 4.0 => Spatial Search - How to

2011-01-13 Thread caman

Thanks
Here was the issue: concatenating the two floats (lat, lng) on the MySQL end
converted the result to a BLOB, and indexing would fail when storing a BLOB in
a 'location' type field. After the BLOB issue was resolved, all worked OK.

Thank you all for your help





Re: Solr 4.0 => Spatial Search - How to

2011-01-12 Thread caman

Adam,

Thanks, yes that helps.
But how does the 'coords' field get populated? All I have is:

[schema snippet not preserved in the archive]

Fields 'lat' and 'lng' get populated by the DataImportHandler, but I am not
sure about 'coords'.

Thanks
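A common way to populate a combined field like 'coords' at import time is DIH's TemplateTransformer; a data-config.xml sketch (the entity and column names are assumptions, not from this thread):

```xml
<entity name="store" transformer="TemplateTransformer"
        query="SELECT id, lat, lng FROM stores">
  <!-- Builds coords as the "lat,lng" string a LatLonType field expects -->
  <field column="coords" template="${store.lat},${store.lng}"/>
</entity>
```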


Solr 4.0 => Spatial Search - How to

2011-01-12 Thread caman

OK, this could be very easy, but I was not able to do it.
I need to enable location search, i.e., if someone searches for the location
'New York', show results for New York plus results within 50 miles of it.
We do have latitude/longitude stored in the database for each record, but I am
not sure how to index these values to enable spatial search.
Any help would be much appreciated.

thanks
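For readers of the archive, the usual shape of the answer on Solr trunk/4.0 was a LatLonType field fed with a "lat,lng" string, filtered with geofilt. A sketch (field names are illustrative; d is in kilometers, so 50 miles is roughly 80.5):

```xml
<!-- schema.xml: the location field plus the hidden coordinate sub-fields -->
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
<field name="pos" type="location" indexed="true" stored="true"/>
```

with a query along the lines of q=*:*&fq={!geofilt sfield=pos pt=40.7143,-74.006 d=80.5}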


RE: Query modification

2010-07-02 Thread caman

And what did you use for entity detection: GATE, OpenNLP?

Do you mind sharing that, please?

 

From: Tommy Chheng-2 [via Lucene]
[mailto:ml-node+939600-682384129-124...@n3.nabble.com] 
Sent: Friday, July 02, 2010 3:20 PM
To: caman
Subject: Re: Query modification

 

  Hi, 
I actually did something similar on http://researchwatch.net/
if you search for "stanford university solar", it will process the query 
by tagging the stanford university to the organization field. 

I created a querycomponent class and altered the query string like 
this(in scala but translatable to java easily): 
   override def prepare(rb: ResponseBuilder){ 
 val params: SolrParams = rb.req.getParams 

 if(params.getBool(COMPONENT_NAME, false)){ 
   val queryString = params.get("q").trim //rb.getQueryString() 
   val entityTransform = new ClearboxEntityDetection 
   val (transformedQuery, explainMap) = 
entityTransform.transformQuery(queryString) 

   rb.setQueryString(transformedQuery) 
   rb.rsp.add("clearboxExplain", explainMap) 
 } 
   } 


@tommychheng 
Programmer and UC Irvine Graduate Student 
Find a great grad school based on research interests:
http://gradschoolnow.com <http://gradschoolnow.com?by-user=t> 


On 7/2/10 3:12 PM, osocurious2 wrote: 


> If I wanted to intercept a query and turn 
>  q=romantic italian restaurant in seattle 
> into 
>  q=romantic tag:restaurant city:seattle cuisine:italian 
> 
> would I subclass QueryComponent, modify the query, and pass it to super?
Or 
> is there a standard way already to do this? 
> 
> What about changing it to 
> q=romantic city:seattle cuisine:italian&fq=type:restaurant 
> 
> would that be the same process, or is there a nuance to modifying a query 
> into a query+filterQuery? 
> 
> Ken 
> 

 



RE: DIH and denormalizing

2010-06-28 Thread caman

In your query 'query="SELECT webtable as wt FROM ncdat_wt WHERE
featurecode='${ncdat.feature}' ..."', instead of ${ncdat.feature} use
${dataTable.feature}, where dataTable is your parent entity's name.
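In data-config.xml terms, the nesting that makes ${dataTable.feature} resolve looks roughly like this (a sketch using the names from the thread):

```xml
<entity name="dataTable" query="SELECT feature FROM ncdat">
  <!-- The child query sees each parent row's columns as ${dataTable.*} -->
  <entity name="wt"
          query="SELECT webtable AS wt FROM ncdat_wt
                 WHERE featurecode='${dataTable.feature}'"/>
</entity>
```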

 

 

 

From: Shawn Heisey-4 [via Lucene]
[mailto:ml-node+929151-1527242139-124...@n3.nabble.com] 
Sent: Monday, June 28, 2010 2:24 PM
To: caman
Subject: DIH and denormalizing

 

I am trying to do some denormalizing with DIH from a MySQL source.   
Here's part of my data-config.xml: 

 
 
 
 

The relationship between features in ncdat and webtable in ncdat_wt (via 
featurecode) will be many-many.  The "wt" field in schema.xml is set up 
as multivalued. 

It seems that ${ncdat.feature} is not being set.  I saw a query 
happening on the server and it was "SELECT webtable as wt FROM ncdat_wt 
WHERE featurecode=''" - that last part is an empty string with single 
quotes around it.  From what I can tell, there are no entries in ncdat 
where feature is blank.  I've tried this with both a 1.5-dev checked out 
months ago (which we are using in production) and a 3.1-dev checked out 
today. 

Am I doing something wrong? 

Thanks, 
Shawn 






RE: Stemmed and/or unStemmed field

2010-06-23 Thread caman

Ah, perfect.

Will take a look. Thanks.

 

From: Robert Muir [via Lucene]
[mailto:ml-node+918302-232685105-124...@n3.nabble.com] 
Sent: Wednesday, June 23, 2010 4:17 PM
To: caman
Subject: Re: Stemmed and/or unStemmed field

 

On Wed, Jun 23, 2010 at 3:58 PM, Vishal A. 
<[hidden email]>wrote: 

> 
> Here is what I am trying to do :  Someone clicks on  'Comforters &
Pillows' 
> , we would want the results to be filtered where title has keyword 
> 'Comforter' or  'Pillows' but we have been getting results with word 
> 'comfort' in the title. I assume it is because of stemming. What is the 
> right way to handle this? 
> 

from your examples, it seems a more lightweight stemmer might be an easy 
option: https://issues.apache.org/jira/browse/LUCENE-2503

-- 
Robert Muir 
[hidden email] 





RE: Can solr return pretty text as the content?

2010-06-23 Thread caman

Define 'pretty text':

1) Are you talking about the XML/JSON returned by Solr not being pretty?
If yes, try indent=on in your query params.

2) Or are you talking about the data in a certain field?
Solr returns what you feed it. Look at the filters for that field type; your
filters/tokenizer may be stripping the formatting.
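For case 1, the request would look something like this (host and core are illustrative):

```
http://localhost:8983/solr/select?q=*:*&indent=on
```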

 

 

 

From: JohnRodey [via Lucene]
[mailto:ml-node+917912-920852633-124...@n3.nabble.com] 
Sent: Wednesday, June 23, 2010 1:19 PM
To: caman
Subject: Can solr return pretty text as the content?

 

When I feed pretty text into solr for indexing from lucene and search for
it, the content is always returned as one long line of text.  Is there a way
for solr to return the pretty formatted text to me? 



RE: DIH full-import memory issue

2010-05-10 Thread caman

This may help:

batchSize : the batch size used in the JDBC connection

 

http://wiki.apache.org/solr/DataImportHandler#Configuring_DataSources
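A dataSource sketch (connection details are placeholders): with the MySQL driver, batchSize="-1" makes DIH set the JDBC fetch size to Integer.MIN_VALUE, so rows are streamed instead of the whole result set being buffered in memory:

```xml
<!-- data-config.xml: streaming fetch avoids the OutOfMemoryError below -->
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/mydb"
            user="user" password="pass" batchSize="-1"/>
```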

 

 

 

 

From: Geek Gamer [via Lucene]
[mailto:ml-node+809069-2054572211-124...@n3.nabble.com] 
Sent: Monday, May 10, 2010 9:42 PM
To: caman
Subject: DIH full-import memory issue

 

Hi, 

I am facing issues with DIH fullimport, 

I have a database with 3 million records that will translate into index size

of 6GB. 

When I am trying to do full import I am getting out of memory error like : 

INFO: Starting Full Import 
May 10, 2010 11:44:06 PM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties 
WARNING: Unable to read: dataimport.properties 
May 10, 2010 11:44:06 PM org.apache.solr.update.DirectUpdateHandler2 deleteAll 
INFO: [] REMOVING ALL DOCUMENTS FROM INDEX 
May 10, 2010 11:44:06 PM org.apache.solr.core.SolrDeletionPolicy onInit 
INFO: SolrDeletionPolicy.onInit: commits:num=1 
commit{dir=/home/search/SOLR/solr/data/index,segFN=segments_1,version=1273549043650,generation=1,filenames=[segments_1] 
May 10, 2010 11:44:06 PM org.apache.solr.core.SolrDeletionPolicy updateCommits 
INFO: newest commit = 1273549043650 
May 10, 2010 11:44:06 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call 
INFO: Creating a connection for entity offer with URL: 
jdbc:mysql://domU-12-31-39-10-59-01.compute-1.internal/jounce1 
May 10, 2010 11:44:07 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call 
INFO: Time taken for getConnection(): 301 

Exception in thread "Timer-1" java.lang.OutOfMemoryError: Java heap space 
at java.util.HashMap.newValueIterator(HashMap.java:843) 
at java.util.HashMap$Values.iterator(HashMap.java:910) 
at org.mortbay.jetty.servlet.HashSessionManager.scavenge(HashSessionManager.java:180) 
at org.mortbay.jetty.servlet.HashSessionManager.access$000(HashSessionManager.java:36) 
at org.mortbay.jetty.servlet.HashSessionManager$1.run(HashSessionManager.java:144) 
at java.util.TimerThread.mainLoop(Timer.java:512) 
at java.util.TimerThread.run(Timer.java:462) 
May 10, 2010 11:54:54 PM org.apache.solr.handler.dataimport.DataImporter doFullImport 
SEVERE: Full Import failed 
org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.OutOfMemoryError: Java heap space 
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:424) 
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242) 
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180) 
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331) 
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389) 
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370) 
Caused by: java.lang.OutOfMemoryError: Java heap space 
at com.mysql.jdbc.MysqlIO.nextRowFast(MysqlIO.java:1621) 
at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1398) 
at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:2816) 
at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:467) 
at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:2510) 
at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:1746) 
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2135) 
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2536) 
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2465) 
at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:734) 
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:246) 
at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210) 
at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39) 
at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:58) 
at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:71) 
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237) 
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:357) 
... 5 more 
May 10, 2010 11:54:54 PM org.apache.solr.update.DirectUpdateHandler2 rollback 
INFO: start rollback 
May 10, 2010 11:54:54 PM org.apache.solr.update.DirectUpdateHandler2 rollback 
INFO: end_rollback 




I tried allocating 4 Gigs of memory to the VM but no luck. 
Are the records cached before indexing or streamed? 
any pointers to documents? 

thanks in anticipation, 
umar 




RE: JSON formatted response from SOLR question....

2010-05-10 Thread caman

Take a look at the AjaxSolr source code:

 

http://github.com/evolvingweb/ajax-solr

 

This should give you exactly what you need.

 

thanks

 

 

 

 

From: Tod [via Lucene]
[mailto:ml-node+789105-593266572-124...@n3.nabble.com] 
Sent: Monday, May 10, 2010 7:22 AM
To: caman
Subject: JSON formatted response from SOLR question

 

I apologize, this is such a JSON/javascript question but I'm stuck and 
am not finding any resources that address this specifically. 

I'm doing a faceted search and getting back in my 
facet_counts.faceted_fields response an array of countries.  I'm 
gathering the count of the array elements returned using this notation: 

rsp.facet_counts.facet_fields.country.length 

... where rsp is the eval'ed JSON response from SOLR.  From there I just 
loop through listing the individual country with its associated count. 

The problem I am having is trying to automate this to loop through any 
one of a number of facets contained in my JSON response, not just 
country.  So instead of the above I would have something like: 

rsp.facet_counts.facet_fields.VARIABLE.length 

... where VARIABLE would be the name of one of the facets passed into a 
javascript function to perform the loop.  None of the javascript 
examples I can find seems to address this.  Has anyone run into this? 
Is there a better list to ask this question? 


Thanks in advance. 
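An archive note: the JavaScript part of the question above is solved by bracket notation, which lets the facet name be a variable instead of a literal. A minimal sketch with a made-up response shape:

```javascript
// Hypothetical eval'ed Solr JSON response (shape as described above).
const rsp = {
  facet_counts: {
    facet_fields: {
      country: ["US", 10, "FR", 5], // Solr flattens value/count pairs
      city: ["Paris", 3]
    }
  }
};

// rsp.facet_counts.facet_fields[name] works where .name would not,
// because the facet name is only known at runtime.
function facetLength(rsp, facetName) {
  return rsp.facet_counts.facet_fields[facetName].length;
}

console.log(facetLength(rsp, "country")); // 4
```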





RE: Embedded Solr search query

2010-05-07 Thread caman

I would just look at the Solr source code and see how the standard search
handler and the DisMax search handler are implemented.

Look under the package 'org.apache.solr.handler'.

 

 

 

From: Eric Grobler [via Lucene]
[mailto:ml-node+783212-2036924225-124...@n3.nabble.com] 
Sent: Friday, May 07, 2010 1:33 AM
To: caman
Subject: Re: Embedded Solr search query

 

Hi Camen, 

I was hoping someone has done it already :-) 
I am also new to Solr/lucene, can you perhaps point me to a request handler 
example page? 

Thanks and Regards 
Eric 

On Fri, May 7, 2010 at 9:05 AM, caman <[hidden email]>wrote: 


> 
> Why not write a custom request handler which can parse, split, execute and

> combine results to your queries? 
> 
> 
> 
> 
> 
> 
> 
> From: Eric Grobler [via Lucene] 
> [mailto:[hidden email]<[hidden email]> 
> ] 
> Sent: Friday, May 07, 2010 1:01 AM 
> To: caman 
> Subject: Embedded Solr search query 
> 
> 
> 
> Hello Solr community, 
> 
> When a user search on our web page, we need to run 3 related but different

> queries. 
> For SEO reasons, we cannot use Ajax so at the moment we run 3 queries 
> sequentially inside a PHP script. 
> Allthough Solr is superfast,  the extra network overhead can make the 3 
> queries 400ms slower than it needs to be. 
> 
> Thus my question is: 
> Is there a way whereby you can send 1 query string to Solr with 2 or more 
> embedded search queries, where Solr will split and execute the queries and

> return the results of the multiple searches in 1 go. 
> 
> In other words, instead of: 
> -  send searchQuery1 
>   get result1 
> -  send searchQuery2 
>   get result2 
> ... 
> 
> you run: 
> - send searchQuery1+searchQuery2 
> - get result1+result2 
> 
> Thanks and Regards 
> Eric 


RE: Help indexing PDF files

2010-05-07 Thread caman

Take a look at Tika library

 

From: Leonardo Azize Martins [via Lucene]
[mailto:ml-node+783677-325080270-124...@n3.nabble.com] 
Sent: Friday, May 07, 2010 6:37 AM
To: caman
Subject: Help indexing PDF files

 

Hi, 

I am new in Solr. 
I would like to index some PDF files. 

How can I do using example schema from 1.4.0 version? 

Regards, 
Leo 





RE: Embedded Solr search query

2010-05-07 Thread caman

Why not write a custom request handler that can parse, split, execute, and
combine the results of your queries?

 

 

 

From: Eric Grobler [via Lucene]
[mailto:ml-node+783150-1027691461-124...@n3.nabble.com] 
Sent: Friday, May 07, 2010 1:01 AM
To: caman
Subject: Embedded Solr search query

 

Hello Solr community, 

When a user search on our web page, we need to run 3 related but different 
queries. 
For SEO reasons, we cannot use Ajax so at the moment we run 3 queries 
sequentially inside a PHP script. 
Allthough Solr is superfast,  the extra network overhead can make the 3 
queries 400ms slower than it needs to be. 

Thus my question is: 
Is there a way whereby you can send 1 query string to Solr with 2 or more 
embedded search queries, where Solr will split and execute the queries and 
return the results of the multiple searches in 1 go. 

In other words, instead of: 
-  send searchQuery1 
   get result1 
-  send searchQuery2 
   get result2 
... 

you run: 
- send searchQuery1+searchQuery2 
- get result1+result2 

Thanks and Regards 
Eric 





RE: run on reboot on windows

2010-05-02 Thread caman

Please take a look at this for Tomcat:

http://tomcat.apache.org/tomcat-6.0-doc/setup.html#Windows

and for Jetty:

http://docs.codehaus.org/display/JETTY/Win32Wrapper

 

 

Hope this helps.

 

From: S Ahmed [via Lucene]
[mailto:ml-node+772182-2115387142-124...@n3.nabble.com] 
Sent: Sunday, May 02, 2010 4:44 PM
To: caman
Subject: Re: run on reboot on windows

 

its not tomcat/jetty that's the issue, its how to get things to re-start on 
a windows server (tomcat and jetty don't run as native windows services) so 
I am a little confused..thanks. 

On Sun, May 2, 2010 at 7:37 PM, caman <[hidden email]>wrote: 


> 
> Ahmed, 
> 
> 
> 
> Best is if you take a look at the documentation of jetty or tomcat. SOLR 
> can 
> run on any web container, it's up to you how you  configure your web 
> container to run 
> 
> 
> 
> Thanks 
> 
> Aboxy 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> From: S Ahmed [via Lucene] 
> [mailto:[hidden email]<[hidden email]> 
> ] 
> Sent: Sunday, May 02, 2010 4:33 PM 
> To: caman 
> Subject: Re: run on reboot on windows 
> 
> 
> 
> By default it uses Jetty, so your saying Tomcat on windows server 2008/ 
> IIS7 
> 
> runs as a native windows service? 
> 
> On Sun, May 2, 2010 at 12:46 AM, Dave Searle <[hidden email]>wrote: 
> 
> 
> > Set tomcat6 service to auto start on boot (if running tomat) 
> > 
> > Sent from my iPhone 
> > 
> > On 2 May 2010, at 02:31, "S Ahmed" <[hidden email]> wrote: 
> > 
> > > Hi, 
> > > 
> > > I'm trying to get Solr to run on windows, such that if it reboots 
> > > the Solr 
> > > service will be running. 
> > > 
> > > How can I do this? 


RE: run on reboot on windows

2010-05-02 Thread caman

Ahmed,

 

Best is to take a look at the documentation for Jetty or Tomcat. Solr can run
on any web container; it's up to you how you configure your container to run
it.

 

Thanks

Aboxy

 

 

 

 

 

From: S Ahmed [via Lucene]
[mailto:ml-node+772174-2097041460-124...@n3.nabble.com] 
Sent: Sunday, May 02, 2010 4:33 PM
To: caman
Subject: Re: run on reboot on windows

 

By default it uses Jetty, so your saying Tomcat on windows server 2008/ IIS7

runs as a native windows service? 

On Sun, May 2, 2010 at 12:46 AM, Dave Searle <[hidden email]>wrote: 


> Set tomcat6 service to auto start on boot (if running tomat) 
> 
> Sent from my iPhone 
> 
> On 2 May 2010, at 02:31, "S Ahmed" <[hidden email]> wrote: 
> 
> > Hi, 
> > 
> > I'm trying to get Solr to run on windows, such that if it reboots 
> > the Solr 
> > service will be running. 
> > 
> > How can I do this? 
> 

 



RE: Only one field in the result

2010-04-28 Thread caman

I think you are looking for the 'fl' param.
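For example (URL and field names are illustrative), this returns only the id and title fields in each result document:

```
http://localhost:8983/solr/select?q=*:*&fl=id,title
```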

 

 

From: pcmanprogrammeur [via Lucene]
[mailto:ml-node+761818-821639313-124...@n3.nabble.com] 
Sent: Wednesday, April 28, 2010 12:38 AM
To: caman
Subject: Only one field in the result

 

Hello, 

In my schema.xml, i have some fields "stored" and "indexed". However, in a
particular case, i would like to get only one field in my XML result ! Is it
possible? 

Thanks for your help ! 



RE: Problem with DataImportHandler and embedded entities

2010-04-21 Thread caman

What is the uniqueKey set in the schema?

 

 

 

From: Jason Rutherglen [via Lucene] 
[mailto:ml-node+740744-1209892083-124...@n3.nabble.com] 
Sent: Wednesday, April 21, 2010 10:56 AM
To: caman
Subject: Re: Problem with DataImportHandler and embedded entities

 

The other issue now is full-import is only importing 1 document, and 
that's all.  Despite no limits etc... Odd... 

On Wed, Apr 21, 2010 at 10:48 AM, Jason Rutherglen 
<[hidden email] 
<http://n3.nabble.com/user/SendEmail.jtp?type=node&node=740744&i=0> > wrote: 


> I think it's working, it was the lack of the seemingly innocuous 
> sub-entity pk="application_id".  After adding that I'm seeing some 
> data returned. 
> 
> On Wed, Apr 21, 2010 at 10:44 AM, Jason Rutherglen 
> <[hidden email] 
> <http://n3.nabble.com/user/SendEmail.jtp?type=node&node=740744&i=1> > wrote: 
>> Something's off, for each row, it's performing the following 5 
>> sub-queries.  Weird.  Below is the updated data-config.xml (compared 
>> to the original email I changed the field from comment to added). 
>> 
>>  
>> --- row #1- 
>> 876 
>> 2009-11-02T06:36:28Z 
>> - 
>> - 
>>  
>> SELECT added FROM ratings WHERE app = 876 
>> SELECT added FROM ratings WHERE app = 876 
>> SELECT added FROM ratings WHERE app = 876 
>> SELECT added FROM ratings WHERE app = 876 
>> SELECT added FROM ratings WHERE app = 876 
>> 0:0:0.0 
>> 0:0:0.0 
>> 0:0:0.0 
>> 0:0:0.0 
>> 0:0:0.0 
>> --- row #1- 
>> 2010-01-26T18:08:53Z 
>> - 
>> --- row #2- 
>> 2010-01-27T20:16:20Z 
>> - 
>> --- row #3- 
>> 2010-01-29T00:02:40Z 
>> - 
>> --- row #4- 
>> 2010-02-01T16:59:42Z 
>> - 
>>  
>>  
>> 
>>  
>>  > driver="com.mysql.jdbc.Driver" url="jdbc:mysql://127.0.0.1:3306/ch" 
>> batchSize="-1" user="ch" password="ch_on_this"/> 
>>   
>>>  query="SELECT id, updated FROM applications limit 10"> 
>>   
>> 
>> 
>>   
>> 
>>   
>>  
>> 
>> On Wed, Apr 21, 2010 at 10:41 AM, caman <[hidden email] 
>> <http://n3.nabble.com/user/SendEmail.jtp?type=node&node=740744&i=2> > wrote: 
>>> 
>>> Hard to tell. 
>>> 
>>> 
>>> 
>>> Did you try putting the child entity part of main query with subquery. 
>>> Don't 
>>> think that is the issue though but worth a try 
>>> 
>>> Select id, updated,( SELECT comment  FROM ratings WHERE app = appParent.id) 
>>> as comment FROM applications appParent limit 10 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> From: Jason Rutherglen [via Lucene] 
>>> [mailto:[hidden email] 
>>> <http://n3.nabble.com/user/SendEmail.jtp?type=node&node=740744&i=3> ] 
>>> Sent: Wednesday, April 21, 2010 10:33 AM 
>>> To: caman 
>>> Subject: Re: Problem with DataImportHandler and embedded entities 
>>> 
>>> 
>>> 
>>> Caman, 
>>> 
>>> I'm storing it.  This is what I see when DataImportHandler verbose is 
>>> turned 
>>> on. 
>>> 
>>> While the field names don't match, I am seeing that sub-queries are 
>>> being performed, data is being returned.  It's just not making it into 
>>> the document. 
>>> 
>>>  
>>> - 
>>>  
>>> - 
>>>  
>>> SELECT id, updated FROM applications limit 10 
>>> 0:0:0.9 
>>> --- row #1- 
>>> 407 
>>> 2009-11-02T06:35:48Z 
>>> - 
>>> - 
>>>  
>>> SELECT added FROM ratings WHERE app = 407 
>>> 0:0:0.8 
>>>  
>>>  
>>> 
>>> On Wed, Apr 21, 2010 at 10:17 AM, caman <[hidden email] 
>>> <http://n3.nabble.com/user/SendEmail.jtp?type=node 
>>> <http://n3.nabble.com/user/SendEmail.jtp?type=node&node=740680&i=0> 
>>> &node=740680&i=0> > wrote: 
>>> 
>>> 
>>> 
>>>> 
>>>> Are you storing the comment field

RE: Problem with DataImportHandler and embedded entities

2010-04-21 Thread caman

Hard to tell.

 

Did you try putting the child-entity part of the main query into a subquery?
I don't think that is the issue, but it's worth a try:

SELECT id, updated, (SELECT comment FROM ratings WHERE app = appParent.id)
AS comment FROM applications appParent LIMIT 10
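
If the flattened query works, the whole import can be expressed without a child
entity. A rough data-config.xml sketch (the entity name and field list are
assumptions, not from the original config):

```xml
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://127.0.0.1:3306/ch"
              batchSize="-1" user="ch" password="ch_on_this"/>
  <document>
    <!-- one row per application; the correlated subquery pulls the comment
         into the same result set, so no nested <entity> is needed -->
    <entity name="app"
            query="SELECT id, updated,
                          (SELECT comment FROM ratings WHERE app = appParent.id) AS comment
                   FROM applications appParent LIMIT 10">
      <field column="id" name="id"/>
      <field column="updated" name="updated"/>
      <field column="comment" name="comment"/>
    </entity>
  </document>
</dataConfig>
```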

 

 

From: Jason Rutherglen [via Lucene]
[mailto:ml-node+740680-1955771337-124...@n3.nabble.com] 
Sent: Wednesday, April 21, 2010 10:33 AM
To: caman
Subject: Re: Problem with DataImportHandler and embedded entities

 

Caman, 

I'm storing it.  This is what I see when DataImportHandler verbose is turned
on. 

While the field names don't match, I am seeing that sub-queries are 
being performed, data is being returned.  It's just not making it into 
the document. 

 
- 
 
- 
 
SELECT id, updated FROM applications limit 10 
0:0:0.9 
--- row #1- 
407 
2009-11-02T06:35:48Z 
- 
- 
 
SELECT added FROM ratings WHERE app = 407 
0:0:0.8 
 
 

On Wed, Apr 21, 2010 at 10:17 AM, caman <[hidden email]
<http://n3.nabble.com/user/SendEmail.jtp?type=node&node=740680&i=0> > wrote:



> 
> Are you storing the comment field or only indexing it? 
> 
> A field defined with stored="false" will not appear in the document. 
> 
> 
> 
> From: Jason Rutherglen [via Lucene] 
> [mailto:[hidden email]
<http://n3.nabble.com/user/SendEmail.jtp?type=node&node=740680&i=1> ] 
> Sent: Wednesday, April 21, 2010 10:15 AM 
> To: caman 
> Subject: Problem with DataImportHandler and embedded entities 
> 
> 
> 
> I'm using the following data-config.xml with DataImportHandler.  I've 
> never used embedded entities before however I'm not seeing the comment 
> show up in the document... I'm not sure what's up. 
> 
>  
>   driver="com.mysql.jdbc.Driver" url="jdbc:mysql://127.0.0.1:3306/ch" 
> batchSize="-1" user="ch" password="ch_on_this"/> 
>   
>  query="SELECT id, updated FROM applications limit 10"> 
>   
> 
>   
> 
>   
>  
> 
> 
> 
>  _ 
> 
> View message @ 
>
http://n3.nabble.com/Problem-with-DataImportHandler-and-embedded-entities-tp
> 740624p740624.html 
> To start a new topic under Solr - User, email 
> [hidden email]
<http://n3.nabble.com/user/SendEmail.jtp?type=node&node=740680&i=2>  
> To unsubscribe from Solr - User, click 
> < (link removed) 
> yc3R1ZmZAZ21haWwuY29tfDQ3MjA2OHwtOTM0OTI1NzEx>  here. 
> 
> 
> 
> 
> -- 
> View this message in context:
http://n3.nabble.com/Problem-with-DataImportHandler-and-embedded-entities-tp
740624p740634.html
> Sent from the Solr - User mailing list archive at Nabble.com. 
> 

 


 


-- 
View this message in context: 
http://n3.nabble.com/Problem-with-DataImportHandler-and-embedded-entities-tp740624p740708.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Problem with DataImportHandler and embedded entities

2010-04-21 Thread caman

Are you storing the comment field or only indexing it?

A field defined with stored="false" will not appear in the document.
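
For the comment to come back in query responses, the field has to be stored. A
minimal schema.xml sketch (the field type is an assumption):

```xml
<!-- stored="true" makes the raw value retrievable in responses;
     indexed="true" makes it searchable -->
<field name="comment" type="text" indexed="true" stored="true"/>
```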

 

From: Jason Rutherglen [via Lucene]
[mailto:ml-node+740624-966329660-124...@n3.nabble.com] 
Sent: Wednesday, April 21, 2010 10:15 AM
To: caman
Subject: Problem with DataImportHandler and embedded entities

 

I'm using the following data-config.xml with DataImportHandler.  I've 
never used embedded entities before however I'm not seeing the comment 
show up in the document... I'm not sure what's up. 

<dataConfig> 
  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://127.0.0.1:3306/ch" 
      batchSize="-1" user="ch" password="ch_on_this"/> 
  <document> 
    <entity query="SELECT id, updated FROM applications limit 10"> 
      <entity query="SELECT ... FROM ratings WHERE app = ..."> 
      </entity> 
    </entity> 
  </document> 
</dataConfig> 




 


-- 
View this message in context: 
http://n3.nabble.com/Problem-with-DataImportHandler-and-embedded-entities-tp740624p740634.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: dismax vs the standard query handlers

2010-04-20 Thread caman

Your answers are here; the wiki describes it pretty well:

 

http://wiki.apache.org/solr/DisMaxRequestHandler
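
For reference, a minimal dismax handler for solrconfig.xml might look like
this (the field names and boosts are made up for illustration):

```xml
<requestHandler name="/dismax-example" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- search title and body, weighting title matches twice as high -->
    <str name="qf">title^2.0 body^1.0</str>
    <!-- minimum-should-match: relax the clause requirement for long queries -->
    <str name="mm">2&lt;-1 5&lt;80%</str>
  </lst>
</requestHandler>
```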

 

 

 

From: Sandhya Agarwal [via Lucene] 
[mailto:ml-node+739071-961078546-124...@n3.nabble.com] 
Sent: Tuesday, April 20, 2010 9:40 PM
To: caman
Subject: dismax vs the standard query handlers

 

Hello, 

What are the advantages of using the “dismax” query handler vs the “standard” 
query handler.  As I understand, “dismax” queries are parsed differently and 
provide more flexibility w.r.t score boosting etc. Do we have any more reasons 
? 

Thanks, 
Sandhya 




 


-- 
View this message in context: 
http://n3.nabble.com/dismax-vs-the-standard-query-handlers-tp739071p739081.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: DIH dataimport.properties with

2010-04-20 Thread caman

Shawn,

 

Is this your custom implementation?

 

"For a delta-import, minDid comes from 
the maxDid value stored after the last successful import.

"

 

Are you updating the dataTable after the import was successful? How did you
handle this? I have a similar scenario, and your approach would work for my
use-case as well.

 

 

thanks

 

 

 

 

 

From: Shawn Heisey-4 [via Lucene]
[mailto:ml-node+738653-1765413222-124...@n3.nabble.com] 
Sent: Tuesday, April 20, 2010 4:35 PM
To: caman
Subject: Re: DIH dataimport.properties with

 

Michael, 

The SolrEntityProcessor looks very intriguing, but it won't work with 
the released 1.4 version.  If that's OK with you and it looks like it'll 
do what you want, feel free to ignore the rest of this. 

I'm also using MySQL as an import source for Solr.  I was unable to use 
the last_index_time because my database doesn't have a field I can match 
against it.  I believe you can use something similar to the method that 
I came up with.  The point of this post is to show you how to inject 
values from outside Solr into a DIH request rather than have Solr 
provide the milestone that indicates new content. 

Here's a simplified version of my URL template and entity configuration 
in data-config.xml.  The did field in my database is an autoincrement 
BIGINT serving as my private key, but something similar could likely be 
cooked up with timestamps too: 

http://HOST:PORT/solr/CORE/dataimport?command=COMMAND&dataTable=DATATABLE&minDid=MINDID&maxDid=MAXDID

If I am doing a full-import, I set minDid to zero and maxDid to the 
highest value in the database.  For a delta-import, minDid comes from 
the maxDid value stored after the last successful import. 

The deltaQuery is required, but in my case, is a throw-away query that 
just tells Solr the delta-import needs to be run.  My query and 
deltaImportQuery are identical, though yours may not be. 
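
The request parameters in that URL can be referenced from data-config.xml via
${dataimporter.request.*}. A sketch of what the entity might look like, with
the table and column names assumed (the deltaQuery is the throw-away trigger
described above):

```xml
<entity name="docs"
        query="SELECT * FROM ${dataimporter.request.dataTable}
               WHERE did &gt; ${dataimporter.request.minDid}
                 AND did &lt;= ${dataimporter.request.maxDid}"
        deltaQuery="SELECT 1 AS did"
        deltaImportQuery="SELECT * FROM ${dataimporter.request.dataTable}
               WHERE did &gt; ${dataimporter.request.minDid}
                 AND did &lt;= ${dataimporter.request.maxDid}"/>
```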

Good luck, no matter how you choose to approach this. 

Shawn 


On 4/18/2010 9:02 PM, Michael Tibben wrote: 


> I don't really understand how this will help. Can you elaborate ? 
> 
> Do you mean that the last_index_time can be imported from somewhere 
> outside solr?  But I need to be able to *set* what last_index_time is 
> stored in dataimport.properties, not get properties from somewhere else 
> 
> 
> 
> On 18/04/10 10:02, Lance Norskog wrote: 
>> The SolrEntityProcessor allows you to query a Solr instance and use 
>> the results as DIH properties. You would have to create your own 
>> regular query to do the delta-import instead of using the delta-import 
>> feature. 






 


-- 
View this message in context: 
http://n3.nabble.com/DIH-dataimport-properties-with-tp722924p738949.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Solr Core Creation

2010-04-20 Thread caman

What was the command executed? 

 

 

From: abhatna...@vantage.com [via Lucene]
[mailto:ml-node+733159-1790924601-124...@n3.nabble.com] 
Sent: Tuesday, April 20, 2010 11:58 AM
To: caman
Subject: Solr Core Creation

 

I tried creating a core on the fly using remote server 

-I am able to query against it however it didn't create any new folder
inside solr home 

is this the expected behavior? 

I tried searching for this topic but couldn't find a good answer. 



-Ankit 


 


-- 
View this message in context: 
http://n3.nabble.com/Solr-Core-Creation-tp733159p733268.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: CopyField

2010-04-15 Thread caman

As far as I know, no.

But why don't you keep another column, 'source_final', and populate it with
the value from source1 or source2, depending on which has a value (look at
transformers, maybe the ScriptTransformer)? Then in schema.xml:

<copyField source="source_final" dest="dest"/>
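
The "compute a final column, then copy it" idea might look like this,
combining a DIH ScriptTransformer with a copyField (a sketch; all column and
field names are assumptions):

```xml
<!-- data-config.xml: derive source_final per row with a ScriptTransformer -->
<dataConfig>
  <script><![CDATA[
    function pickSource(row) {
      // prefer source1; fall back to source2 when source1 is empty
      var s1 = row.get('source1');
      row.put('source_final', (s1 != null && s1 != '') ? s1 : row.get('source2'));
      return row;
    }
  ]]></script>
  <document>
    <entity name="doc" transformer="script:pickSource"
            query="SELECT id, source1, source2 FROM docs"/>
  </document>
</dataConfig>
```

In schema.xml, `<copyField source="source_final" dest="dest"/>` then routes the
computed value into the destination field.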

 

Thanks

James

http://www.click2money.com

 

 

From: Blargy [via Lucene]
[mailto:ml-node+722785-1511121936-124...@n3.nabble.com] 
Sent: Thursday, April 15, 2010 5:54 PM
To: caman
Subject: CopyField

 

Is there any way to instruct copyField to overwrite an existing field, or only
accept the first one? 

<copyField source="source1" dest="dest"/> 
<copyField source="source2" dest="dest"/> 

Basically I want to copy source1 to dest (if it exists). If source1 doesn't
exist then copy source2 into dest. 

Is this possible? 


 


-- 
View this message in context: 
http://n3.nabble.com/CopyField-tp722785p722800.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: DIH questions

2010-04-15 Thread caman

I had a similar requirement and was not able to figure it out at the time. I
was able to use some SQL magic to create a concatenated string for the
sub-entities and then process it in a transformer, which may or may not work
for your use-case. Just a thought. 
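
The "SQL magic plus transformer" route usually means concatenating the child
rows into one column and splitting it again at import time. A sketch using
MySQL's GROUP_CONCAT and the RegexTransformer (all table and column names
assumed):

```xml
<entity name="item" transformer="RegexTransformer"
        query="SELECT i.id, i.title,
                      (SELECT GROUP_CONCAT(t.name SEPARATOR '|')
                       FROM tags t WHERE t.item_id = i.id) AS tags_concat
               FROM items i">
  <!-- split the concatenated string back into a multi-valued 'tag' field -->
  <field column="tag" sourceColName="tags_concat" splitBy="\|"/>
</entity>
```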

Please mention the specifics here and I can see if anything can be done.

 

Thanks

James

http://www.click2money.com

 

 

From: Blargy [via Lucene]
[mailto:ml-node+722651-1893075853-124...@n3.nabble.com] 
Sent: Thursday, April 15, 2010 4:28 PM
To: caman
Subject: Re: DIH questions

 

Is there any way that a sub-entity can delete/rewrite fields from the
document? Is there any way sub-entities can get access to the document's
current value for a given field? 


 


-- 
View this message in context: 
http://n3.nabble.com/DIH-questions-tp719892p722676.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: dynamic categorization & transactional data

2010-03-20 Thread caman

@Grant
Less than a minute.  If we go with the meta-retrieval from the index, we
will have to keep the index updated down to seconds. But that may not scale
well.  Probably a hybrid approach?
I will look into a classifier. Thanks.





Grant Ingersoll-6 wrote:
> 
> 
> On Mar 18, 2010, at 2:44 PM, caman wrote:
> 
>> 
>> 1) Took care of the first one by Transformer.
> 
> This is often also something done by a classifier that is trained to deal
> with all the statistical variations in your text.  Tools like Weka,
> Mahout, OpenNLP, etc. can be applied here.
> 
>> 2) Any input on 2 please? I need to store # of views and popularity with
>> each document and that can change pretty often. Recommended to use
>> database
>> or can this be updated to SOLr directly? My issue with DB is that with
>> every
>> SOLR search hit, will have to do DB hit to retrieve meta-data. 
> 
> Define often, please.  Less than a minute or more than a minute?
> 
>> 
>> Any input is appreciated please
>> 
>> caman wrote:
>>> 
>>> Hello all,
>>> 
>>> Please see below.any help much appreciated.
>>> 1) Extracting data out of a text field to assign a category for certain
>>> configured words. e.g. If the text is "Google does it again with
>>> Android" 
>>> and If 'Google' and 'Android' are the configured words, I want to b able
>>> to assign the article to tags 'Google' and 'Android' and 'Technical' .
>>> Can
>>> I do this with a custom filter during analysis? Similarly setting up
>>> categories for each article based on keywords in the text.
>>> 2) How about using SOLR as transactional datastore? Need to keep track
>>> of
>>> rating for each document. Would 'ExternalFileField' be good choice for
>>> this use-case?
>>> 
>>> Thanks in advance.
>>> 
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/dynamic-categorization---transactional-data-tp27790233p27949786.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>> 
> 
> --
> Grant Ingersoll
> http://www.lucidimagination.com/
> 
> Search the Lucene ecosystem using Solr/Lucene:
> http://www.lucidimagination.com/search
> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/dynamic-categorization---transactional-data-tp27790233p27970656.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: dynamic categorization & transactional data

2010-03-18 Thread caman

David,

Much appreciated. This gives me enough to work with. 
I missed one important point: our data changes pretty frequently, which means
we may be running deltas every 5-10 minutes. In-memory should work.
thanks
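
For reference, the ExternalFileField option mentioned in this thread is
declared roughly like this (a sketch; the field and type names are
assumptions):

```xml
<!-- schema.xml: keyField ties each line of the external file to a document -->
<fieldType name="popularityFile" class="solr.ExternalFileField"
           keyField="id" defVal="0" valType="float"/>
<field name="popularity" type="popularityFile"/>
```

The values live in a file named external_popularity in the index data
directory, one docid=value pair per line, and can be refreshed without
reindexing the documents.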





David Smiley @MITRE.org wrote:
> 
> You'll probably want to influence your relevancy on this popularity number
> that is changing often.  ExternalFileField looks like a possibility though
> I haven't used it.  Another would be using an in-memory cache which stores
> all popularity numbers for any data that has its popularity updated since
> the last index update (say since the previous night).  On second thought,
> it may need to be absolutely all of them but these are just #s so no big
> deal?  You could then customize a "ValueSource" subclass which gets data
> from this fast in-memory up to date source.  See FileFloatSource for an
> example that uses a file instead of an in-memory structure.
> 
> ~ David Smiley
> Author: http://www.packtpub.com/solr-1-4-enterprise-search-server/
> 
> 
> On Mar 18, 2010, at 2:44 PM, caman wrote:
> 
>> 2) Any input on 2 please? I need to store # of views and popularity with
>> each document and that can change pretty often. Recommended to use
>> database
>> or can this be updated to SOLr directly? My issue with DB is that with
>> every
>> SOLR search hit, will have to do DB hit to retrieve meta-data. 
>> 
>> Any input is appreciated please
> 
> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/dynamic-categorization---transactional-data-tp27790233p27950036.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: dynamic categorization & transactional data

2010-03-18 Thread caman

1) Took care of the first one with a Transformer.
2) Any input on 2, please? I need to store the # of views and popularity with
each document, and that can change pretty often. Is it recommended to use a
database, or can this be updated in Solr directly? My issue with the DB is
that with every Solr search hit, we will have to do a DB hit to retrieve the
meta-data. 

Any input is appreciated, please.

caman wrote:
> 
> Hello all,
> 
> Please see below.any help much appreciated.
> 1) Extracting data out of a text field to assign a category for certain
> configured words. e.g. If the text is "Google does it again with Android" 
> and If 'Google' and 'Android' are the configured words, I want to b able
> to assign the article to tags 'Google' and 'Android' and 'Technical' . Can
> I do this with a custom filter during analysis? Similarly setting up
> categories for each article based on keywords in the text.
> 2) How about using SOLR as transactional datastore? Need to keep track of
> rating for each document. Would 'ExternalFileField' be good choice for
> this use-case?
> 
> Thanks in advance.
> 

-- 
View this message in context: 
http://old.nabble.com/dynamic-categorization---transactional-data-tp27790233p27949786.html
Sent from the Solr - User mailing list archive at Nabble.com.



dynamic categorization & transactional data

2010-03-04 Thread caman

Hello all,

Please see below. Any help is much appreciated.
1) Extracting data out of a text field to assign a category for certain
configured words. e.g. If the text is "Google does it again with Android" 
and if 'Google' and 'Android' are the configured words, I want to be able to
assign the article to the tags 'Google', 'Android', and 'Technical'. Can I do
this with a custom filter during analysis? Similarly, setting up categories
for each article based on keywords in the text.
2) How about using Solr as a transactional datastore? We need to keep track of
the rating for each document. Would 'ExternalFileField' be a good choice for
this use-case?

Thanks in advance.
-- 
View this message in context: 
http://old.nabble.com/dynamic-categorization---transactional-data-tp27790233p27790233.html
Sent from the Solr - User mailing list archive at Nabble.com.



SOLR Index or database

2010-03-03 Thread caman

Hello All, 

Just struggling with whether Solr or a database would be the better option for
me. Here are my requirements:
We index about 600+ news sites/blogs into our system. The only information we
store locally is the title, link, and article snippet. We are able to index
all these sources into the Solr index and it works perfectly.
This is where it gets tricky: 
We need to store certain meta information as well, e.g.
1. Rating/popularity of an article
2. Sharing of articles between users
3. How many times an article is viewed
4. Comments on each article

So far, we are planning to store the meta information in the database and link
this data with a document in the index. When a user opens the page, results
are combined from the index and the database to render the view. 

Any reservations about the above architecture? 
Is Solr the right fit in this case? We do need full-text search, so Solr is a
no-brainer IMHO, but I would love to hear the community's view.

Any feedback is appreciated.
Any feedback appreciated

thanks




-- 
View this message in context: 
http://old.nabble.com/SOLR-Index-or-database-tp27772362p27772362.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Indexing an oracle warehouse table

2010-02-03 Thread caman

Thanks. I will give this a shot.
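
For reference, the highlighting approach suggested below might be wired up as
request defaults in solrconfig.xml (a sketch; whether hl.fl accepts a wildcard
depends on the Solr version, as Alexey notes):

```xml
<requestHandler name="/search-example" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="hl">true</str>
    <!-- highlight every field; the matching field name can then be read
         off the highlighting section of the response -->
    <str name="hl.fl">*</str>
  </lst>
</requestHandler>
```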

Alexey-34 wrote:
> 
>> What would be the right way to point out which field contains the term
>> searched for.
> I would use highlighting for all of these fields and then post process
> Solr response in order to check highlighting tags. But I don't have so
> many fields usually and don't know if it's possible to configure Solr
> to highlight fields using '*' as dynamic fields.
> 
> On Wed, Feb 3, 2010 at 2:43 AM, caman 
> wrote:
>>
>> Thanks all. I am on track.
>> Another question:
>> What would be the right way to point out which field contains the term
>> searched for.
>> e.g. If I search for SOLR and if the term exist in field788 for a
>> document,
>> how do I pinpoint that which field has the term.
>> I copied all the fields in field called 'body' which makes searching
>> easier
>> but would be nice to show the field which has that exact term.
>>
>> thanks
>>
>> caman wrote:
>>>
>>> Hello all,
>>>
>>> hope someone can point me to right direction. I am trying to index an
>>> oracle warehouse table(TableA) with 850 columns. Out of the structure
>>> about 800 fields are CLOBs and are good candidate to enable full-text
>>> searching. Also have few columns which has relational link to other
>>> tables. I am clean on how to create a root entity and then pull data
>>> from
>>> other relational link as child entities.  Most columns in TableA are
>>> named
>>> as field1,field2...field800.
>>> Now my question is how to organize the schema efficiently:
>>> First option:
>>> if my query is 'select * from TableA', Do I  define >> column="FIELD1" /> for each of those 800 columns?   Seems cumbersome.
>>> May
>>> be can write a script to generate XML instead of handwriting both in
>>> data-config.xml and schema.xml.
>>> OR
>>> Dont define any  so that column in
>>> SOLR will be same as in the database table. But questions are 1)How do I
>>> define unique field in this scenario? 2) How to copy all the text fields
>>> to a common field for easy searching?
>>>
>>> Any helpful is appreciated. Please feel free to suggest any alternative
>>> way.
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>>
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Indexing-an-oracle-warehouse-table-tp27414263p27429352.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Indexing-an-oracle-warehouse-table-tp27414263p27439611.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Indexing an oracle warehouse table

2010-02-02 Thread caman

Thanks all. I am on track.
Another question: 
What would be the right way to point out which field contains the term
searched for.
e.g. If I search for SOLR and if the term exist in field788 for a document,
how do I pinpoint that which field has the term.
I copied all the fields in field called 'body' which makes searching easier
but would be nice to show the field which has that exact term.

thanks

caman wrote:
> 
> Hello all,
> 
> hope someone can point me to right direction. I am trying to index an
> oracle warehouse table(TableA) with 850 columns. Out of the structure
> about 800 fields are CLOBs and are good candidate to enable full-text
> searching. Also have few columns which has relational link to other
> tables. I am clean on how to create a root entity and then pull data from
> other relational link as child entities.  Most columns in TableA are named
> as field1,field2...field800.
> Now my question is how to organize the schema efficiently: 
> First option:
> if my query is 'select * from TableA', Do I  define  column="FIELD1" /> for each of those 800 columns?   Seems cumbersome. May
> be can write a script to generate XML instead of handwriting both in
> data-config.xml and schema.xml. 
> OR
> Dont define any  so that column in
> SOLR will be same as in the database table. But questions are 1)How do I
> define unique field in this scenario? 2) How to copy all the text fields
> to a common field for easy searching? 
> 
> Any helpful is appreciated. Please feel free to suggest any alternative
> way.
> 
> Thanks
> 
> 
> 
> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Indexing-an-oracle-warehouse-table-tp27414263p27429352.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Indexing a oracle warehouse table

2010-02-02 Thread caman

Alexey,

This is exactly what I was looking for. Thank you thank you thank you ..
Should have read the documentation a little better.
Much appreciated. 
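
Putting the three suggestions together (dynamic fields, a UUID key,
copyField), the schema.xml additions might look like this (a sketch; the type
names are assumptions):

```xml
<!-- catch-all dynamic field: every database column becomes a text field -->
<dynamicField name="*" type="text" indexed="true" stored="true"/>

<!-- generated unique key, per the UUID technique on the wiki -->
<fieldType name="uuid" class="solr.UUIDField" indexed="true"/>
<field name="id" type="uuid" default="NEW"/>
<uniqueKey>id</uniqueKey>

<!-- funnel everything into one default search field -->
<field name="body" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="*" dest="body"/>
<defaultSearchField>body</defaultSearchField>
```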

Alexey-34 wrote:
> 
>> Dont define any  so that column in
>> SOLR will be same as in the database table.
> Correct
> You can define a dynamic field <dynamicField name="*" indexed="true"
> stored="true"/> ( see
> 
>> 1)How do I define unique field in this scenario?
> You can create primary key into database or generate it directly in
> Solr ( see "UUID techniques" http://wiki.apache.org/solr/UniqueKey )
> 
>> 2) How to copy all the text fields to a common field for easy searching?
> You can use a <copyField> ( see
> http://wiki.apache.org/solr/SchemaXml#Copy_Fields )
> 
> 
> On Tue, Feb 2, 2010 at 4:22 AM, caman 
> wrote:
>>
>> Hello all,
>>
>> hope someone can point me to right direction. I am trying to index an
>> oracle
>> warehouse table(TableA) with 850 columns. Out of the structure about 800
>> fields are CLOBs and are good candidate to enable full-text searching.
>> Also
>> have few columns which has relational link to other tables. I am clean on
>> how to create a root entity and then pull data from other relational link
>> as
>> child entities.  Most columns in TableA are named as
>> field1,field2...field800.
>> Now my question is how to organize the schema efficiently:
>> First option:
>> if my query is 'select * from TableA', Do I  define > column="FIELD1" /> for each of those 800 columns?   Seems cumbersome. May
>> be
>> can write a script to generate XML instead of handwriting both in
>> data-config.xml and schema.xml.
>> OR
>> Dont define any  so that column in
>> SOLR will be same as in the database table. But questions are 1)How do I
>> define unique field in this scenario? 2) How to copy all the text fields
>> to
>> a common field for easy searching?
>>
>> Any helpful is appreciated. Please feel free to suggest any alternative
>> way.
>>
>> Thanks
>>
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Indexing-a-oracle-warehouse-table-tp27414263p27414263.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Indexing-an-oracle-warehouse-table-tp27414263p27426206.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Indexing an oracle warehouse table

2010-02-02 Thread caman

Ron,

Much appreciated.  The search requirements are as follows:
1) Enable search/faceting on author, service, datetime. 
2) Enable full-text search on all text columns, which are named col1 ...
col800+  -- a total of more than 800 columns. 

Here is what I did so far: I defined the entities in db-config.xml without any
column definitions in the file, which basically means I want to keep the field
names the same as in the database. 
Now in schema.xml I have a <field> tag for each database field retrieved with
the SQL queries in db-config.xml, which are more than 800+ (I did not write
this by hand; I wrote a Groovy script to generate it for me from the
database).

Multi-valued: Yes, this is what I am using to copy all the fields
col1...col800+ to one multi-valued field. That field is set as the default for
search.

You are right about going to the original data source, but I had to take a
different approach. The original source is all XML files which do not follow a
standard schema for the structure.

I hope what I mentioned above makes sense. I appreciate the response.





Ron Chan wrote:
> 
> it depends on what the search requirements are, so without knowing the
> details here are some vague pointers 
> 
> you may only need to have fields for the columns you are going to be
> categorizing and searching on, this may be a small subset of the 800 and
> the rest can go into one large field to fulfil the full text search 
> 
> another thing to look into is the multi value fields, this can sometimes
> replace the one-to-many relationships in database 
> 
> also it may sometimes be worth while going to the original data source
> rather than the warehouse table, as this is already flattened and
> denormalised, the flattening and denormalizing will most likely be done a
> different way when solr indexing database type data, highly likely you
> will end up with less rows and less columns in the solr index, as each
> solr document can be seen as "multi-dimensional" 
> 
> 
> - Original Message - 
> From: "caman"  
> To: solr-user@lucene.apache.org 
> Sent: Tuesday, 2 February, 2010 1:23:01 AM 
> Subject: Indexing an oracle warehouse table 
> 
> 
> Hello all, 
> 
> hope someone can point me to right direction. I am trying to index an
> oracle 
> warehouse table(TableA) with 850 columns. Out of the structure about 800 
> fields are CLOBs and are good candidate to enable full-text searching.
> Also 
> have few columns which has relational link to other tables. I am clean on 
> how to create a root entity and then pull data from other relational link
> as 
> child entities. Most columns in TableA are named as 
> field1,field2...field800. 
> Now my question is how to organize the schema efficiently: 
> First option: 
> if my query is 'select * from TableA', Do I define  column="FIELD1" /> for each of those 800 columns? Seems cumbersome. May be 
> can write a script to generate XML instead of handwriting both in 
> data-config.xml and schema.xml. 
> OR 
> Dont define any  so that column in 
> SOLR will be same as in the database table. But questions are 1)How do I 
> define unique field in this scenario? 2) How to copy all the text fields
> to 
> a common field for easy searching? 
> 
> Any helpful is appreciated. Please feel free to suggest any alternative
> way. 
> 
> Thanks 
> 
> 
> 
> 
> 
> -- 
> View this message in context:
> http://old.nabble.com/Indexing-an-oracle-warehouse-table-tp27414263p27414263.html
>  
> Sent from the Solr - User mailing list archive at Nabble.com. 
> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Indexing-an-oracle-warehouse-table-tp27414263p27425156.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Indexing an oracle warehouse table

2010-02-02 Thread caman

Anyone please?


caman wrote:
> 
> Hello all,
> 
> hope someone can point me to right direction. I am trying to index an
> oracle warehouse table(TableA) with 850 columns. Out of the structure
> about 800 fields are CLOBs and are good candidate to enable full-text
> searching. Also have few columns which has relational link to other
> tables. I am clean on how to create a root entity and then pull data from
> other relational link as child entities.  Most columns in TableA are named
> as field1,field2...field800.
> Now my question is how to organize the schema efficiently: 
> First option:
> if my query is 'select * from TableA', Do I  define  column="FIELD1" /> for each of those 800 columns?   Seems cumbersome. May
> be can write a script to generate XML instead of handwriting both in
> data-config.xml and schema.xml. 
> OR
> Dont define any  so that column in
> SOLR will be same as in the database table. But questions are 1)How do I
> define unique field in this scenario? 2) How to copy all the text fields
> to a common field for easy searching? 
> 
> Any helpful is appreciated. Please feel free to suggest any alternative
> way.
> 
> Thanks
> 
> 
> 
> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Indexing-an-oracle-warehouse-table-tp27414263p27424327.html
Sent from the Solr - User mailing list archive at Nabble.com.



Indexing a oracle warehouse table

2010-02-01 Thread caman

Hello all,

hope someone can point me to right direction. I am trying to index an oracle
warehouse table(TableA) with 850 columns. Out of the structure about 800
fields are CLOBs and are good candidate to enable full-text searching. Also
have few columns which has relational link to other tables. I am clean on
how to create a root entity and then pull data from other relational link as
child entities.  Most columns in TableA are named as
field1,field2...field800.
Now my question is how to organize the schema efficiently: 
First option:
if my query is 'select * from TableA', Do I  define  for each of those 800 columns?   Seems cumbersome. May be
can write a script to generate XML instead of handwriting both in
data-config.xml and schema.xml. 
OR
Don't define any <field> mappings, so that each column name in
Solr will be the same as in the database table. But questions are: 1) How do I
define the unique key field in this scenario? 2) How do I copy all the text
fields to a common field for easy searching?

Any help is appreciated. Please feel free to suggest any alternative way.

Thanks





-- 
View this message in context: 
http://old.nabble.com/Indexing-a-oracle-warehouse-table-tp27414263p27414263.html
Sent from the Solr - User mailing list archive at Nabble.com.
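
[A hedged sketch of the second option discussed above. The field and type names
(id, all_text, text, string) are illustrative assumptions, not taken from the
thread: one dynamicField rule covers field1...field800 without listing each
column, a copyField collects them into a single catch-all search field, and
only the unique key needs an explicit definition.]

```xml
<!-- schema.xml sketch (Solr 1.x style); field and type names are assumptions -->
<fields>
  <!-- the unique key still needs an explicit definition -->
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <!-- one rule instead of 800 explicit <field> entries -->
  <dynamicField name="field*" type="text" indexed="true" stored="true"/>
  <!-- catch-all target for easy searching; indexed but not stored -->
  <field name="all_text" type="text" indexed="true" stored="false" multiValued="true"/>
</fields>
<copyField source="field*" dest="all_text"/>
<uniqueKey>id</uniqueKey>
```

With this, no per-column <field> mapping is needed in data-config.xml either, since DIH matches returned columns against the dynamicField pattern.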



Re: Document model suggestion

2009-12-21 Thread caman

Lance,
Makes sense. We are playing around with keeping the security model
completely out of the index. We will filter out results before display,
based on access rights. But the approach you suggested is not ruled out
completely.
thanks

Lance Norskog-2 wrote:
> 
> Yes, you would have 'role' as a multi-valued field. When you add
> someone to a role, you don't have to re-index. That's all.
> 
> On Thu, Dec 17, 2009 at 12:55 PM, caman 
> wrote:
>>
>> Are you suggesting that roles should be maintained in the index? We do
>> manage
>> out authentication based on roles but at granular level, user rights play
>> a
>> big role as well.
>> I know we need to compromise, just need to find a balance.
>>
>> Thanks
>>
>>
>> Lance Norskog-2 wrote:
>>>
>>> Role-based authentication is one level of sophistication up from
>>> user-based authentication. Users can have different roles, and
>>> authentication goes against roles. Documents with multiple viewers
>>> would be assigned special roles. All users would also have their own
>>> matching role.
>>>
>>> On Tue, Dec 15, 2009 at 10:01 AM, caman 
>>> wrote:
>>>>
>>>> Erick,
>>>> I know what you mean.
>>>> Wonder if it is actually cleaner to keep the authorization  model out
>>>> of
>>>> solr index and filter the data at client side based on the user access
>>>> rights.
>>>> Thanks all for help.
>>>>
>>>>
>>>>
>>>> Erick Erickson wrote:
>>>>>
>>>>> Yes, that should work. One hard part is what happens if your
>>>>> authorization model has groups, especially when membership
>>>>> in those groups changes. Then you have to go in and update
>>>>> all the affected docs.
>>>>>
>>>>> FWIW
>>>>> Erick
>>>>>
>>>>> On Tue, Dec 15, 2009 at 12:24 PM, caman
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> Shalin,
>>>>>>
>>>>>> Thanks. much appreciated.
>>>>>> Question about:
>>>>>>  "That is usually what people do. The hard part is when some
>>>>>> documents
>>>>>> are
>>>>>> shared across multiple users. "
>>>>>>
>>>>>> What do you recommend when documents has to be shared across multiple
>>>>>> users?
>>>>>> Can't I just multivalue a field with all the users who has access to
>>>>>> the
>>>>>> document?
>>>>>>
>>>>>>
>>>>>> thanks
>>>>>>
>>>>>> Shalin Shekhar Mangar wrote:
>>>>>> >
>>>>>> > On Tue, Dec 15, 2009 at 7:26 AM, caman
>>>>>> > wrote:
>>>>>> >
>>>>>> >>
>>>>>> >> Appreciate any guidance here please. Have a master-child table
>>>>>> between
>>>>>> >> two
>>>>>> >> tables 'TA' and 'TB' where form is the master table. Any row in TA
>>>>>> can
>>>>>> >> have
>>>>>> >> multiple row in TB.
>>>>>> >> e.g. row in TA
>>>>>> >>
>>>>>> >> id---name
>>>>>> >> 1---tweets
>>>>>> >>
>>>>>> >> TB:
>>>>>> >> id|ta_id|field0|field1|field2.|field20|created_by
>>>>>> >> 1|1|value1|value2|value2.|value20|User1
>>>>>> >>
>>>>>> >> 
>>>>>> >
>>>>>> >>
>>>>>> >> This works fine and index the data.But all the data for a row in
>>>>>> TA
>>>>>> gets
>>>>>> >> combined in one document(not desirable).
>>>>>> >> I am not clear on how to
>>>>>> >>
>>>>>> >> 1) separate a particular row from the search results.
>>>>>> >> e.g. If I search for 'Android' and there are 5 rows for android in
>>>>>> TB
>>>>>> for
>>>>>> >> a
>>>>>> particular instance in TA, would like to show them separately to user...

Re: Document model suggestion

2009-12-17 Thread caman

Are you suggesting that roles should be maintained in the index? We do manage
our authentication based on roles, but at a granular level user rights play a
big role as well.
I know we need to compromise; we just need to find the balance.

Thanks


Lance Norskog-2 wrote:
> 
> Role-based authentication is one level of sophistication up from
> user-based authentication. Users can have different roles, and
> authentication goes against roles. Documents with multiple viewers
> would be assigned special roles. All users would also have their own
> matching role.
> 
> On Tue, Dec 15, 2009 at 10:01 AM, caman 
> wrote:
>>
>> Erick,
>> I know what you mean.
>> Wonder if it is actually cleaner to keep the authorization  model out of
>> solr index and filter the data at client side based on the user access
>> rights.
>> Thanks all for help.
>>
>>
>>
>> Erick Erickson wrote:
>>>
>>> Yes, that should work. One hard part is what happens if your
>>> authorization model has groups, especially when membership
>>> in those groups changes. Then you have to go in and update
>>> all the affected docs.
>>>
>>> FWIW
>>> Erick
>>>
>>> On Tue, Dec 15, 2009 at 12:24 PM, caman
>>> wrote:
>>>
>>>>
>>>> Shalin,
>>>>
>>>> Thanks. much appreciated.
>>>> Question about:
>>>>  "That is usually what people do. The hard part is when some documents
>>>> are
>>>> shared across multiple users. "
>>>>
>>>> What do you recommend when documents has to be shared across multiple
>>>> users?
>>>> Can't I just multivalue a field with all the users who has access to
>>>> the
>>>> document?
>>>>
>>>>
>>>> thanks
>>>>
>>>> Shalin Shekhar Mangar wrote:
>>>> >
>>>> > On Tue, Dec 15, 2009 at 7:26 AM, caman
>>>> > wrote:
>>>> >
>>>> >>
>>>> >> Appreciate any guidance here please. Have a master-child table
>>>> between
>>>> >> two
>>>> >> tables 'TA' and 'TB' where form is the master table. Any row in TA
>>>> can
>>>> >> have
>>>> >> multiple row in TB.
>>>> >> e.g. row in TA
>>>> >>
>>>> >> id---name
>>>> >> 1---tweets
>>>> >>
>>>> >> TB:
>>>> >> id|ta_id|field0|field1|field2.|field20|created_by
>>>> >> 1|1|value1|value2|value2.|value20|User1
>>>> >>
>>>> >> 
>>>> >
>>>> >>
>>>> >> This works fine and index the data.But all the data for a row in TA
>>>> gets
>>>> >> combined in one document(not desirable).
>>>> >> I am not clear on how to
>>>> >>
>>>> >> 1) separate a particular row from the search results.
>>>> >> e.g. If I search for 'Android' and there are 5 rows for android in
>>>> TB
>>>> for
>>>> >> a
>>>> >> particular instance in TA, would like to show them separately to
>>>> user
>>>> and
>>>> >> if
>>>> >> the user click on any of the row,point them to an attached URL in
>>>> the
>>>> >> application. Should a separate index be maintained for each row in
>>>> TB?TB
>>>> >> can
>>>> >> have millions of rows.
>>>> >>
>>>> >
>>>> > The easy answer is that whatever you want to show as results should
>>>> be
>>>> the
>>>> > thing that you index as documents. So if you want to show tweets as
>>>> > results,
>>>> > one document should represent one tweet.
>>>> >
>>>> > Solr is different from relational databases and you should not think
>>>> about
>>>> > both the same way. De-normalization is the way to go in Solr.
>>>> >
>>>> >
>>>> >> 2) How to protect one user's data from another user. I guess I can
>>>> keep
>>>> a
>>>> >> column for a user_id in the schema and append that filter
>>>> automatically
>>>> >> when
>>>> >> I search through SOLR. Any better alternatives?
>>>> >>
>>>> >>
>>>> > That is usually what people do. The hard part is when some documents
>>>> are
>>>> > shared across multiple users.
>>>> >
>>>> >
>>>> >> Bear with me if these are newbie questions please, this is my first
>>>> day
>>>> >> with
>>>> >> SOLR.
>>>> >>
>>>> >>
>>>> > No problem. Welcome to Solr!
>>>> >
>>>> > --
>>>> > Regards,
>>>> > Shalin Shekhar Mangar.
>>>> >
>>>> >
>>>>
>>>> --
>>>> View this message in context:
>>>> http://old.nabble.com/Document-model-suggestion-tp26784346p26798445.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>
>>>>
>>>
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Document-model-suggestion-tp26784346p26799016.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> Lance Norskog
> goks...@gmail.com
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Document-model-suggestion-tp26784346p26834798.html
Sent from the Solr - User mailing list archive at Nabble.com.
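
[A hedged client-side illustration of the filter approach discussed in this
thread. The field names owner_id and role are hypothetical, and real values
would need Solr query-syntax escaping; this only sketches how a filter clause
appended to every search could be built.]

```python
def acl_filter(user_id, roles):
    """Build a Solr filter clause restricting results to documents the
    user owns or that are shared with one of the user's roles.
    Field names (owner_id, role) are hypothetical; values are assumed
    to already be safe for Solr query syntax."""
    clauses = ["owner_id:%s" % user_id]
    clauses.extend("role:%s" % r for r in roles)
    return " OR ".join(clauses)

# The client appends this to every search as &fq=...
print(acl_filter("user1", ["admin", "editor"]))
# owner_id:user1 OR role:admin OR role:editor
```

Because the clause goes into fq rather than q, it does not affect relevance scoring, only which documents are visible.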



Re: Document model suggestion

2009-12-15 Thread caman

Erick,
I know what you mean. 
Wonder if it is actually cleaner to keep the authorization model out of the
Solr index and filter the data on the client side based on the user access
rights.
Thanks all for the help.



Erick Erickson wrote:
> 
> Yes, that should work. One hard part is what happens if your
> authorization model has groups, especially when membership
> in those groups changes. Then you have to go in and update
> all the affected docs.
> 
> FWIW
> Erick
> 
> On Tue, Dec 15, 2009 at 12:24 PM, caman
> wrote:
> 
>>
>> Shalin,
>>
>> Thanks. much appreciated.
>> Question about:
>>  "That is usually what people do. The hard part is when some documents
>> are
>> shared across multiple users. "
>>
>> What do you recommend when documents has to be shared across multiple
>> users?
>> Can't I just multivalue a field with all the users who has access to the
>> document?
>>
>>
>> thanks
>>
>> Shalin Shekhar Mangar wrote:
>> >
>> > On Tue, Dec 15, 2009 at 7:26 AM, caman
>> > wrote:
>> >
>> >>
>> >> Appreciate any guidance here please. Have a master-child table between
>> >> two
>> >> tables 'TA' and 'TB' where form is the master table. Any row in TA can
>> >> have
>> >> multiple row in TB.
>> >> e.g. row in TA
>> >>
>> >> id---name
>> >> 1---tweets
>> >>
>> >> TB:
>> >> id|ta_id|field0|field1|field2.|field20|created_by
>> >> 1|1|value1|value2|value2.|value20|User1
>> >>
>> >> 
>> >
>> >>
>> >> This works fine and index the data.But all the data for a row in TA
>> gets
>> >> combined in one document(not desirable).
>> >> I am not clear on how to
>> >>
>> >> 1) separate a particular row from the search results.
>> >> e.g. If I search for 'Android' and there are 5 rows for android in TB
>> for
>> >> a
>> >> particular instance in TA, would like to show them separately to user
>> and
>> >> if
>> >> the user click on any of the row,point them to an attached URL in the
>> >> application. Should a separate index be maintained for each row in
>> TB?TB
>> >> can
>> >> have millions of rows.
>> >>
>> >
>> > The easy answer is that whatever you want to show as results should be
>> the
>> > thing that you index as documents. So if you want to show tweets as
>> > results,
>> > one document should represent one tweet.
>> >
>> > Solr is different from relational databases and you should not think
>> about
>> > both the same way. De-normalization is the way to go in Solr.
>> >
>> >
>> >> 2) How to protect one user's data from another user. I guess I can
>> keep
>> a
>> >> column for a user_id in the schema and append that filter
>> automatically
>> >> when
>> >> I search through SOLR. Any better alternatives?
>> >>
>> >>
>> > That is usually what people do. The hard part is when some documents
>> are
>> > shared across multiple users.
>> >
>> >
>> >> Bear with me if these are newbie questions please, this is my first
>> day
>> >> with
>> >> SOLR.
>> >>
>> >>
>> > No problem. Welcome to Solr!
>> >
>> > --
>> > Regards,
>> > Shalin Shekhar Mangar.
>> >
>> >
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Document-model-suggestion-tp26784346p26798445.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Document-model-suggestion-tp26784346p26799016.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Document model suggestion

2009-12-15 Thread caman

Shalin,

Thanks, much appreciated.
Question about: 
 "That is usually what people do. The hard part is when some documents are
shared across multiple users. "

What do you recommend when documents have to be shared across multiple users?
Can't I just multi-value a field with all the users who have access to the
document?


thanks

Shalin Shekhar Mangar wrote:
> 
> On Tue, Dec 15, 2009 at 7:26 AM, caman
> wrote:
> 
>>
>> Appreciate any guidance here please. Have a master-child table between
>> two
>> tables 'TA' and 'TB' where form is the master table. Any row in TA can
>> have
>> multiple row in TB.
>> e.g. row in TA
>>
>> id---name
>> 1---tweets
>>
>> TB:
>> id|ta_id|field0|field1|field2.|field20|created_by
>> 1|1|value1|value2|value2.|value20|User1
>>
>> 
> 
>>
>> This works fine and index the data.But all the data for a row in TA gets
>> combined in one document(not desirable).
>> I am not clear on how to
>>
>> 1) separate a particular row from the search results.
>> e.g. If I search for 'Android' and there are 5 rows for android in TB for
>> a
>> particular instance in TA, would like to show them separately to user and
>> if
>> the user click on any of the row,point them to an attached URL in the
>> application. Should a separate index be maintained for each row in TB?TB
>> can
>> have millions of rows.
>>
> 
> The easy answer is that whatever you want to show as results should be the
> thing that you index as documents. So if you want to show tweets as
> results,
> one document should represent one tweet.
> 
> Solr is different from relational databases and you should not think about
> both the same way. De-normalization is the way to go in Solr.
> 
> 
>> 2) How to protect one user's data from another user. I guess I can keep a
>> column for a user_id in the schema and append that filter automatically
>> when
>> I search through SOLR. Any better alternatives?
>>
>>
> That is usually what people do. The hard part is when some documents are
> shared across multiple users.
> 
> 
>> Bear with me if these are newbie questions please, this is my first day
>> with
>> SOLR.
>>
>>
> No problem. Welcome to Solr!
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Document-model-suggestion-tp26784346p26798445.html
Sent from the Solr - User mailing list archive at Nabble.com.



Document model suggestion

2009-12-14 Thread caman

Appreciate any guidance here, please. Have a master-child relationship between
two tables, 'TA' and 'TB', where 'TA' is the master table. Any row in TA can
have multiple rows in TB.
e.g. row in TA 

id---name
1---tweets

TB:
id|ta_id|field0|field1|field2...|field20|created_by
1|1|value1|value2|value2...|value20|User1

This is how I am trying to model this in SOLR:

[data-config.xml snippet stripped by the mailing-list archive: a root entity
over TA with a nested child entity selecting the matching TB rows]
This works fine and indexes the data, but all the data for a row in TA gets
combined into one document (not desirable).
I am not clear on how to:

1) Separate a particular row in the search results.
E.g. if I search for 'Android' and there are 5 matching rows in TB for a
particular instance in TA, I would like to show them separately to the user,
and if the user clicks on any of the rows, point them to an attached URL in the
application. Should a separate index be maintained for each row in TB? TB can
have millions of rows.
2) Protect one user's data from another user. I guess I can keep a
column for a user_id in the schema and append that filter automatically when
I search through Solr. Any better alternatives?

Bear with me if these are newbie questions, please; this is my first day with
Solr.


Thanks

-- 
View this message in context: 
http://old.nabble.com/Document-model-suggestion-tp26784346p26784346.html
Sent from the Solr - User mailing list archive at Nabble.com.
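
[A hedged sketch of the de-normalized DIH config this advice implies, since the
original config snippet did not survive the archive. The datasource details and
the exact join are assumptions; table and column names follow the thread. TB is
made the root entity so that each TB row becomes its own Solr document, with TA
joined in as a child entity.]

```xml
<!-- data-config.xml sketch: one document per TB row; connection details are placeholders -->
<dataConfig>
  <dataSource driver="oracle.jdbc.OracleDriver"
              url="jdbc:oracle:thin:@//dbhost:1521/db"
              user="user" password="pass"/>
  <document>
    <entity name="tb"
            query="SELECT id, ta_id, field0, field1, created_by FROM TB">
      <field column="id" name="id"/>
      <field column="created_by" name="created_by"/>
      <!-- pull the parent TA row's name onto each TB document -->
      <entity name="ta" query="SELECT name FROM TA WHERE id = '${tb.ta_id}'">
        <field column="name" name="ta_name"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

Keeping created_by as its own field also supports the per-user filtering asked about in question 2.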



Re: An issue with <commit> using Solr Cell and multiple files

2009-09-10 Thread caman

You are right.
I ran into the same thing: Windows curl gave me the error, but cygwin's curl
ran without any issues.

thanks


Lance Norskog-2 wrote:
> 
> It is a Windows problem (or curl, whatever).  This works with
> double-quotes:
> 
> C:\Users\work\Downloads>\cygwin\home\work\curl-7.19.4\curl.exe
> http://localhost:8983/solr/update --data-binary "<commit/>" -H
> "Content-type:text/xml; charset=utf-8"
> Single-quotes inside double-quotes should work too: "<commit waitFlush='false'/>"
> 
> 
> On Tue, Sep 8, 2009 at 11:59 AM, caman
> wrote:
> 
>>
>> seems to be an error with curl
>>
>>
>>
>>
>> Kevin Miller-17 wrote:
>> >
>> > I am getting the same error message.  I am running Solr on a Windows
>> > machine.  Is the commit command a curl command or is it a Solr command?
>> >
>> >
>> > Kevin Miller
>> > Web Services
>> >
>> > -Original Message-
>> > From: Grant Ingersoll [mailto:gsing...@apache.org]
>> > Sent: Tuesday, September 08, 2009 12:52 PM
>> > To: solr-user@lucene.apache.org
>> > Subject: Re: An issue with <commit> using Solr Cell and multiple files
>> >
>> > solr/examples/exampledocs/post.sh does:
>> > curl $URL --data-binary '<commit/>' -H 'Content-type:text/xml;
>> > charset=utf-8'
>> >
>> > Not sure if that helps or how it compares to the book.
>> >
>> > On Sep 8, 2009, at 1:48 PM, Kevin Miller wrote:
>> >
>> >> I am using the Solr nightly build from 8/11/2009.  I am able to index
>> >> my documents using the Solr Cell but when I attempt to send the commit
>> >
>> >> command I get an error.  I am using the example found in the Solr 1.4
>> >> Enterprise Search Server book (recently released) found on page 84.
>> >> It
>> >> shows to commit the changes as follows (I am showing where my files
>> >> are located not the example in the book):
>> >>
>> >>>> c:\curl\bin\curl http://echo12:8983/solr/update/ -H "Content-Type:
>> >> text/xml" --data-binary '<commit/>'
>> >>
>> >> this give me this error: The system cannot find the file specified.
>> >>
>> >> I get the same error when I modify it to look like the following:
>> >>
>> >>>> c:\curl\bin\curl http://echo12:8983/solr/update/ '<commit waitFlush="false"/>'
>> >>>> c:\curl\bin\curl "http://echo12:8983/solr/update/" -H "Content-Type:
>> >> text/xml" --data-binary '<commit/>'
>> >>>> c:\curl\bin\curl http://echo12:8983/solr/update/ '<commit/>'
>> >>>> c:\curl\bin\curl "http://echo12:8983/solr/update/" '<commit/>'
>> >>
>> >> I am using the example configuration in Solr so my documents are found
>> >
>> >> in the exampledocs folder also my curl program in located in the root
>> >> directory which is the reason for the way the curl command is being
>> >> executed.
>> >>
>> >> I would appreciate any information on where to look or how to get the
>> >> commit command to execute after indexing multiple files.
>> >>
>> >> Kevin Miller
>> >> Oklahoma Tax Commission
>> >> Web Services
>> >
>> > --
>> > Grant Ingersoll
>> > http://www.lucidimagination.com/
>> >
>> > Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
>> > using Solr/Lucene:
>> > http://www.lucidimagination.com/search
>> >
>> >
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/An-issue-with-%3Ccommit-%3E-using-Solr-Cell-and-multiple-files-tp25350995p25352122.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> Lance Norskog
> goks...@gmail.com
> 
> 

-- 
View this message in context: 
http://www.nabble.com/An-issue-with-%3Ccommit-%3E-using-Solr-Cell-and-multiple-files-tp25350995p25394203.html
Sent from the Solr - User mailing list archive at Nabble.com.
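
[For reference, the body each of these working commands POSTs to /solr/update
is just a small XML update message; the two attributes are optional and shown
only for completeness.]

```xml
<!-- XML update message body for a commit; both attributes are optional -->
<commit waitFlush="false" waitSearcher="false"/>
```

The quoting trouble above is purely about getting this body past the Windows shell intact; the XML itself is the same in every variant.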



RE: An issue with <commit> using Solr Cell and multiple files

2009-09-08 Thread caman

seems to be an error with curl




Kevin Miller-17 wrote:
> 
> I am getting the same error message.  I am running Solr on a Windows
> machine.  Is the commit command a curl command or is it a Solr command? 
> 
> 
> Kevin Miller
> Web Services
> 
> -Original Message-
> From: Grant Ingersoll [mailto:gsing...@apache.org] 
> Sent: Tuesday, September 08, 2009 12:52 PM
> To: solr-user@lucene.apache.org
> Subject: Re: An issue with <commit> using Solr Cell and multiple files
> 
> solr/examples/exampledocs/post.sh does:
> curl $URL --data-binary '<commit/>' -H 'Content-type:text/xml;
> charset=utf-8'
> 
> Not sure if that helps or how it compares to the book.
> 
> On Sep 8, 2009, at 1:48 PM, Kevin Miller wrote:
> 
>> I am using the Solr nightly build from 8/11/2009.  I am able to index 
>> my documents using the Solr Cell but when I attempt to send the commit
> 
>> command I get an error.  I am using the example found in the Solr 1.4
>> Enterprise Search Server book (recently released) found on page 84.   
>> It
>> shows to commit the changes as follows (I am showing where my files 
>> are located not the example in the book):
>>
 c:\curl\bin\curl http://echo12:8983/solr/update/ -H "Content-Type:
>> text/xml" --data-binary '<commit/>'
>>
>> this give me this error: The system cannot find the file specified.
>>
>> I get the same error when I modify it to look like the following:
>>
 c:\curl\bin\curl http://echo12:8983/solr/update/ '<commit waitFlush="false"/>'
 c:\curl\bin\curl "http://echo12:8983/solr/update/" -H "Content-Type:
>> text/xml" --data-binary '<commit/>'
 c:\curl\bin\curl http://echo12:8983/solr/update/ '<commit/>'
 c:\curl\bin\curl "http://echo12:8983/solr/update/" '<commit/>'
>>
>> I am using the example configuration in Solr so my documents are found
> 
>> in the exampledocs folder also my curl program in located in the root 
>> directory which is the reason for the way the curl command is being 
>> executed.
>>
>> I would appreciate any information on where to look or how to get the 
>> commit command to execute after indexing multiple files.
>>
>> Kevin Miller
>> Oklahoma Tax Commission
>> Web Services
> 
> --
> Grant Ingersoll
> http://www.lucidimagination.com/
> 
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
> using Solr/Lucene:
> http://www.lucidimagination.com/search
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/An-issue-with-%3Ccommit-%3E-using-Solr-Cell-and-multiple-files-tp25350995p25352122.html
Sent from the Solr - User mailing list archive at Nabble.com.