Re: why don't we have a forum for discussion?

2009-02-20 Thread Gunnar Wagenknecht
Martin Lamothe schrieb:
 This mailing list overloads my poor BB curve.

You can configure BIS/BES to not deliver mailing list email to your device.

Note that this mailing list is already available as a newsgroup via NNTP
today. No need to subscribe. Just get an NNTP news reader (e.g. Mozilla
Thunderbird). :)

news://news.gmane.org/gmane.comp.jakarta.lucene.solr.user

-Gunnar

-- 
Gunnar Wagenknecht
gun...@wagenknecht.org
http://wagenknecht.org/



Field Boosting Code

2009-02-20 Thread dabboo

Hi,

I was looking into the Solr code and was trying to figure out where the
code for field boosting is written. I am specifically looking for the
classes that get called for that functionality.

If somebody knows where the code is, it would be of great help.

Thanks,
Amit Garg
-- 
View this message in context: 
http://www.nabble.com/Field-Boosting-Code-tp22118997p22118997.html
Sent from the Solr - User mailing list archive at Nabble.com.



Boosting Code

2009-02-20 Thread dabboo

Hi,

Can anyone please tell me where I can find the actual logic/implementation
of field boosting in Solr? I am looking for the classes.

Thanks,
Amit Garg
-- 
View this message in context: 
http://www.nabble.com/Boosting-Code-tp22119017p22119017.html
Sent from the Solr - User mailing list archive at Nabble.com.



Retrieve last indexed documents...

2009-02-20 Thread Pierre-Yves LANDRON

Hello everybody,

I suppose this is a very common question, and I'm sorry if it has been answered
before: how can I retrieve the last indexed documents? (I use a timestamp field
defined as <field name="timestamp" type="date" indexed="true" stored="true"
default="NOW" multiValued="false"/>.)

Thanks,
Pierre Landron

_
Show them the way! Add maps and directions to your party invites. 
http://www.microsoft.com/windows/windowslive/products/events.aspx

Re: Field Boosting Code

2009-02-20 Thread Grant Ingersoll
It's in Lucene.  See the Field class.  Assuming you mean boosting the  
Field at index time and not boosting the term (text + field name) at  
query time.
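For illustration, at the Solr level these index-time boosts are supplied in the XML update message and mapped down to Lucene's Document.setBoost/Field.setBoost; a sketch (the field names here are made up):

```xml
<add>
  <doc boost="2.5">                                       <!-- document-level boost -->
    <field name="title" boost="2.0">Some title</field>    <!-- field-level boost -->
    <field name="body">Some body text</field>
  </doc>
</add>
```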


On Feb 20, 2009, at 6:26 AM, dabboo wrote:



Hi,

I was looking into the Solr code and was trying to figure out as  
where the
code for field boosting is written. I am specifically looking for  
classes,

which gets called for that functionality.

If somebody knows as where the code is, it will be of great help.

Thanks,
Amit Garg
--
View this message in context: 
http://www.nabble.com/Field-Boosting-Code-tp22118997p22118997.html
Sent from the Solr - User mailing list archive at Nabble.com.



--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Add jdbc entity to DataImportHandler in runtime

2009-02-20 Thread Rui Pereira
Hello all!
I'm trying to add JDBC entities to Solr at runtime. I can update
data-config.xml and reload the file using the reload-config command, but I
want to run a first import for just the new entities (not a full index),
that is, add to the index the data returned by the queries of the new
entities. How can I manage to do this?

Thanks in advance.


Re: Add jdbc entity to DataImportHandler in runtime

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 5:44 PM, Rui Pereira ruipereira...@gmail.comwrote:

 Hello all!
 I'm trying to add jdbc entities to Solr in runtime. I can update
 data-config.xml and reload the file using the reload-config command, but I
 wanted to make the first index on the new entities (not full-index), that
 is, add to index the data given by the query in the new entities.
 How can I manage to do this?


You can use 'entity=changed_entity_1&entity=changed_entity_2' when
calling full-import to import only the specified entities.
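For example, the request might look like this (host, port, and entity names are placeholders):

```text
http://localhost:8080/solr/dataimport?command=full-import&entity=changed_entity_1&entity=changed_entity_2
```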

-- 
Regards,
Shalin Shekhar Mangar.


delta-import not giving updated records

2009-02-20 Thread con

Hi all,

I am trying to run delta-import. For this I have the data-config.xml
below:

<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="oracle.jdbc.driver.OracleDriver"
              url="***" user="" password="*"/>
  <document>
    <entity name="users" transformer="TemplateTransformer
            pk="USER_ID"
            query="select USERS.USER_ID, USERS.USER_NAME, USERS.CREATED_TIMESTAMP
                   FROM USERS, CUSTOMERS where USERS.USER_ID = CUSTOMERS.USER_ID"
            deltaquery="select USERS.USER_ID, USERS.USER_NAME,
                        USERS.CREATED_TIMESTAMP FROM USERS, CUSTOMERS
                        where USERS.USER_ID = CUSTOMERS.USER_ID">
      <field column="rowtype" template="users"/>
    </entity>
  </document>
</dataConfig>

But nothing happens when I call
http://localhost:8080/solr/users/dataimport?command=delta-import, whereas
dataimport.properties does get updated with the time at which
delta-import is run.

Meanwhile, http://localhost:8080/solr/users/dataimport?command=full-import is
properly inserting data.

Can anybody suggest what is wrong with this configuration?

Thanks
con


-- 
View this message in context: 
http://www.nabble.com/delta-import-not-giving-updated-records-tp22120184p22120184.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: delta-import not giving updated records

2009-02-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
There is a very good chance that the query created by DIH is wrong.
Try giving the 'deltaImportQuery' explicitly in the entity.

On Fri, Feb 20, 2009 at 6:48 PM, con convo...@gmail.com wrote:

 Hi alll

 I am trying to run delta-import. For this I am having the below
 data-config.xml

 dataConfig
dataSource type=JdbcDataSource 
 driver=oracle.jdbc.driver.OracleDriver
 url=*** user= password=*/
document
entity name=users transformer=TemplateTransformer 
 pk=USER_ID
query=select USERS.USER_ID, USERS.USER_NAME, 
 USERS.CREATED_TIMESTAMP
 FROM USERS, CUSTOMERS where USERS.USER_ID = CUSTOMERS.USER_ID

deltaquery=select USERS.USER_ID, USERS.USER_NAME,
 USERS.CREATED_TIMESTAMP FROM USERS, CUSTOMERS where USERS.USER_ID =
 CUSTOMERS.USER_ID 
field column=rowtype template=users /
/entity
/document
 /dataConfig

 But nothing is happening when i call
 http://localhost:8080/solr/users/dataimport?command=delta-import. Whereas
 the dataimport.properties is getting updated with the time at which
 delta-import is run.

 Where as http://localhost:8080/solr/users/dataimport?command=full-import is
 properly inserting data.

 Can anybody suggest what is wrong with this configuration.

 Thanks
 con


 --
 View this message in context: 
 http://www.nabble.com/delta-import-not-giving-updated-records-tp22120184p22120184.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
--Noble Paul


Re: delta-import not giving updated records

2009-02-20 Thread Shalin Shekhar Mangar
1. There is no closing quote in transformer=TemplateTransformer
2. Attribute names are case-sensitive so it should be deltaQuery instead of
deltaquery
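With those two fixes applied, the entity might look like the sketch below. Note also that the original deltaquery selects every joined row rather than only changed ones; DIH expects deltaQuery to return just the primary keys of rows modified since the last import, which deltaImportQuery then uses to fetch each document (the timestamp condition here is an assumption about the intended schema):

```xml
<entity name="users" transformer="TemplateTransformer" pk="USER_ID"
        query="select USERS.USER_ID, USERS.USER_NAME, USERS.CREATED_TIMESTAMP
               from USERS, CUSTOMERS where USERS.USER_ID = CUSTOMERS.USER_ID"
        deltaQuery="select USER_ID from USERS
                    where CREATED_TIMESTAMP &gt; '${dataimporter.last_index_time}'"
        deltaImportQuery="select USERS.USER_ID, USERS.USER_NAME, USERS.CREATED_TIMESTAMP
                          from USERS, CUSTOMERS where USERS.USER_ID = CUSTOMERS.USER_ID
                          and USERS.USER_ID = '${dataimporter.delta.USER_ID}'">
  <field column="rowtype" template="users"/>
</entity>
```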

On Fri, Feb 20, 2009 at 6:48 PM, con convo...@gmail.com wrote:


 Hi alll

 I am trying to run delta-import. For this I am having the below
 data-config.xml

 dataConfig
dataSource type=JdbcDataSource
 driver=oracle.jdbc.driver.OracleDriver
 url=*** user= password=*/
document
entity name=users
 transformer=TemplateTransformer pk=USER_ID
query=select USERS.USER_ID, USERS.USER_NAME,
 USERS.CREATED_TIMESTAMP
 FROM USERS, CUSTOMERS where USERS.USER_ID = CUSTOMERS.USER_ID

deltaquery=select USERS.USER_ID, USERS.USER_NAME,
 USERS.CREATED_TIMESTAMP FROM USERS, CUSTOMERS where USERS.USER_ID =
 CUSTOMERS.USER_ID 
field column=rowtype template=users /
/entity
/document
 /dataConfig

 But nothing is happening when i call
 http://localhost:8080/solr/users/dataimport?command=delta-import. Whereas
 the dataimport.properties is getting updated with the time at which
 delta-import is run.

 Where as http://localhost:8080/solr/users/dataimport?command=full-importis
 properly inserting data.

 Can anybody suggest what is wrong with this configuration.

 Thanks
 con


 --
 View this message in context:
 http://www.nabble.com/delta-import-not-giving-updated-records-tp22120184p22120184.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Regards,
Shalin Shekhar Mangar.


Re: Retrieve last indexed documents...

2009-02-20 Thread Otis Gospodnetic

Pierre,

This is the issue to watch: https://issues.apache.org/jira/browse/SOLR-1023

I don't think there is a super nice way to do that currently.  You could use 
the match-all query (*:*) and sort by timestamp desc, and use start=0&rows=1.  
Using a raw timestamp that includes milliseconds is not recommended unless you 
really need milliseconds.
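Concretely, such a request might look like this (host and port assumed):

```text
http://localhost:8983/solr/select?q=*:*&sort=timestamp+desc&start=0&rows=1
```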


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Pierre-Yves LANDRON pland...@hotmail.com
 To: solr-user@lucene.apache.org
 Sent: Friday, February 20, 2009 8:04:28 PM
 Subject: Retrieve last indexed documents...
 
 
 Hello everybody,
 
 I suppose this is a very common question, and I'm sorry if it has been 
 answered 
 before : How can I retrieve the last indexed documents (I use a timestamp 
 field 
 defined as 
 default=NOW multiValued=false/) ? 
 
 Thanks,
 Pierre Landron
 
 _
 Show them the way! Add maps and directions to your party invites. 
 http://www.microsoft.com/windows/windowslive/products/events.aspx



Re: Add jdbc entity to DataImportHandler in runtime

2009-02-20 Thread Rui Pereira
Only one more question: doesn't full-import delete all records before
execution, or in this case does it only delete the entities passed in the URL?

Thanks in advance,
Rui Pereira


On Fri, Feb 20, 2009 at 1:07 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 On Fri, Feb 20, 2009 at 5:44 PM, Rui Pereira ruipereira...@gmail.com
 wrote:

  Hello all!
  I'm trying to add jdbc entities to Solr in runtime. I can update
  data-config.xml and reload the file using the reload-config command, but
 I
  wanted to make the first index on the new entities (not full-index), that
  is, add to index the data given by the query in the new entities.
  How can I manage to do this?
 

 You can use 'entity=changed_entity_1&entity=changed_entity_2' when
 calling full-import to import only the specified entities.

 --
 Regards,
 Shalin Shekhar Mangar.



RE: Retrieve last indexed documents...

2009-02-20 Thread Pierre-Yves LANDRON

OK, thanks,

That's what I've done; I'd kind of hoped that there was a nicer way to go,
but after all, it works that way anyway...

Cheers,
P Landron

 Date: Fri, 20 Feb 2009 06:05:24 -0800
 From: otis_gospodne...@yahoo.com
 Subject: Re: Retrieve last indexed documents...
 To: solr-user@lucene.apache.org
 
 
 Pierre,
 
 This is the issue to watch: https://issues.apache.org/jira/browse/SOLR-1023
 
 I don't think there is a super nice way to do that currently.  You could use 
 the match-all query (*:*) and sort by timestamp desc, and use start=0&rows=1. 
  Using a raw timestamp that includes milliseconds is not recommended unless 
 you really need milliseconds.
 
 
 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
 - Original Message 
  From: Pierre-Yves LANDRON pland...@hotmail.com
  To: solr-user@lucene.apache.org
  Sent: Friday, February 20, 2009 8:04:28 PM
  Subject: Retrieve last indexed documents...
  
  
  Hello everybody,
  
  I suppose this is a very common question, and I'm sorry if it has been 
  answered 
  before : How can I retrieve the last indexed documents (I use a timestamp 
  field 
  defined as 
  default=NOW multiValued=false/) ? 
  
  Thanks,
  Pierre Landron
  
  _
  Show them the way! Add maps and directions to your party invites. 
  http://www.microsoft.com/windows/windowslive/products/events.aspx
 

_
Invite your mail contacts to join your friends list with Windows Live Spaces. 
It's easy!
http://spaces.live.com/spacesapi.aspx?wx_action=createwx_url=/friends.aspxmkt=en-us

concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Marc Sturlese

Hey there,
I am indexing 3 cores concurrently from 3 different MySQL tables (I do it
every 5 minutes with a cron job).
The three cores use JdbcDataSource as the datasource in data-config.xml.
At some point, the core that fetches more MySQL rows starts running so
slow that the thread seems to stop (but the other two keep working
fine)... but Java doesn't throw an exception...
I am using a nightly from early January. I found someone who experienced the
same problem and uploaded a templateString patch to make it thread-safe.

http://www.nabble.com/Concurrency-problem-with-delta-import-td21665540.html#a21665540

The thing is, even with this, the problem doesn't disappear.
Does someone know what is happening?
Thank you.
-- 
View this message in context: 
http://www.nabble.com/concurrency-problem-with-delta-import-%28indexing-various-cores-simultaniously%29-tp22120430p22120430.html
Sent from the Solr - User mailing list archive at Nabble.com.



Defining shards in solrconfig with multiple cores

2009-02-20 Thread jdleider

Hey All,

I am trying to load balance two solr installations, solr1 and solr2. Each
box is running 4 cores, core0 - core3. I would like to define the shards for
each box in solrconfig as such:

<lst name="defaults">
  <str name="shards">solr1:8080/solr/core0,solr1:8080/solr/core1,solr1:8080/solr/core2,solr1:8080/solr/core3</str>
</lst>

For whatever reason /admin works. However, when I try to /select using
this shards param in solrconfig.xml, the query just hangs. I've looked
everywhere trying to figure this one out and the syntax looks right. The
query works as it is supposed to when the shards param is removed from
solrconfig.xml and appended to the URL. However, I can't use the load
balancer if I have to specify the shards hosts in the URL.

Am I doing something wrong or is this not supported yet? Is there a
workaround that I can use?

Thanks!

Justin

-- 
View this message in context: 
http://www.nabble.com/Defining-shards-in-solrconfig-with-multiple-cores-tp22120446p22120446.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 8:41 PM, Marc Sturlese marc.sturl...@gmail.comwrote:


 Hey there,
 I am indexing 3 cores concurrently from 3 diferent mysql tables (I do it
 every 5 minutes with a cron job).
 The three cores use JdbcDataSource as datasource in data-config.xml
 Reached a point, the core that fetches more mysql rows starts running so so
 solw until the thread seems to stop (but the other tow keep working
 fine)...but java doesn't throw and exception...
 I am using a nightly from early january. I found someone experienced the
 same problem and uploaded a templateString patch to make it thread-save.


Marc, I'd strongly recommend using a more recent nightly build. There was
another problem related to unsafe usage of SimpleDateFormat which was fixed
recently.

See https://issues.apache.org/jira/browse/SOLR-1017 (which was fixed on 11th
Feb)
-- 
Regards,
Shalin Shekhar Mangar.


Re: Add jdbc entity to DataImportHandler in runtime

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 8:01 PM, Rui Pereira ruipereira...@gmail.comwrote:

 Only one more question: doesn't full-import deletes all records before
 execution, or in this case only deletes the entities passed in the url?


If no 'entity' parameter is specified, a full-import deletes all existing
documents. But if a 'entity' is specified then the deleteQuery is not
executed. There's no way for DataImportHandler to figure out which documents
were generated by which entity.

You can use the 'preImportDeleteQuery' attribute on an entity to specify a
delete query which can delete the documents created by that entity.

http://wiki.apache.org/solr/DataImportHandler#head-70d3fdda52de9ee4fdb54e1c6f84199f0e1caa76
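For example, combined with the rowtype field used in the earlier data-config, the entity could carry a delete query scoped to its own documents (a sketch; field and query values are illustrative):

```xml
<entity name="users" pk="USER_ID"
        preImportDeleteQuery="rowtype:users"
        query="select USER_ID, USER_NAME from USERS">
  <field column="rowtype" template="users"/>
</entity>
```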

-- 
Regards,
Shalin Shekhar Mangar.


Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Marc Sturlese

Hey,
Yeah, I patched the SimpleDateFormat bug reported by Ryuuichi as well.
Is there any other known concurrency bug that maybe I am missing?
In my use case I could manage to index non-concurrently, but I would like to
discover why this is happening...

Thank you very much!



Shalin Shekhar Mangar wrote:
 
 On Fri, Feb 20, 2009 at 8:41 PM, Marc Sturlese
 marc.sturl...@gmail.comwrote:
 

 Hey there,
 I am indexing 3 cores concurrently from 3 diferent mysql tables (I do it
 every 5 minutes with a cron job).
 The three cores use JdbcDataSource as datasource in data-config.xml
 Reached a point, the core that fetches more mysql rows starts running so
 so
 solw until the thread seems to stop (but the other tow keep working
 fine)...but java doesn't throw and exception...
 I am using a nightly from early january. I found someone experienced the
 same problem and uploaded a templateString patch to make it thread-save.

 
 Marc, I'd strongly recommend using a more recent nightly build. There was
 another problem related to unsafe usage of SimpleDateFormat which was
 fixed
 recently.
 
 See https://issues.apache.org/jira/browse/SOLR-1017 (which was fixed on
 11th
 Feb)
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 

-- 
View this message in context: 
http://www.nabble.com/concurrency-problem-with-delta-import-%28indexing-various-cores-simultaniously%29-tp22120430p22123287.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 10:43 PM, Marc Sturlese marc.sturl...@gmail.comwrote:


 Hey,
 Yeah, I patched the bug reported by Ryuuichi of the SimpleDateFormat
 aswell.
 Is there any other known concurrency bug that maybe I am missing?
 In my use case I could manage to index not concurrently but would like to
 discover why this is happening...

 Thank you very much!


I don't see any obvious issue except for these two fixes. Are you
experiencing this problem even after applying both of Ryuuichi's fixes?

-- 
Regards,
Shalin Shekhar Mangar.


Question about etag

2009-02-20 Thread Pascal Dimassimo

Hi guys,
 
I'm having trouble understanding the behavior of firefox and the etag.
 
After cleaning the cache, I send this request from firefox:
 
GET /solr/select/?q=television HTTP/1.1
Host: localhost:8088
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) 
Gecko/2009011913 Firefox/3.0.6 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: JSESSIONID=AA71D602A701BB6287C60083DD6879CD
 
Which solr responds with:
 
HTTP/1.1 200 OK
Last-Modified: Thu, 19 Feb 2009 19:57:14 GMT
ETag: NmViOTJkMjc1ODgwMDAwMFNvbHI=
Content-Type: text/xml; charset=utf-8
Transfer-Encoding: chunked
Server: Jetty(6.1.3)
(#data following#)
 
So far so good. But then I press F5 to refresh the page. Now, if I understand 
correctly the way the ETag works, Firefox should send the request with an 
If-None-Match header along with the ETag, and then the server should return a 
304 Not Modified code.
 
But what happens is that Firefox just doesn't send anything. In the Firebug 
window I only see 0 requests. Just to make sure, I tested with tcpmon and 
nothing is sent by Firefox.
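The expected exchange on refresh would be something like (a sketch, reusing the ETag from above):

```text
GET /solr/select/?q=television HTTP/1.1
If-None-Match: NmViOTJkMjc1ODgwMDAwMFNvbHI=

HTTP/1.1 304 Not Modified
```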
 
Is this making sense? Am I missing something?
 
My solrconfig.xml has this config:


 
 
Thanks!


_
The new Windows Live Messenger. You don’t want to miss this.
http://www.microsoft.com/windows/windowslive/products/messenger.aspx

Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Marc Sturlese

Yes,
Now it's almost three days non-stop that I have been running updates with the 3
cores via cron jobs. If there are updates of 1 docs everything is
alright. When I start doing updates of 30 is when that core runs really
slow. I have to abort the import in that core and keep updating with fewer
rows each time.
Another thing to point out is that Tomcat reaches the maximum memory I allow
(2 GiB) and never goes down (but at least it doesn't run out of memory). Is
that normal? Shouldn't the memory go down a lot after an update is
completed?

Thank you very much!


Shalin Shekhar Mangar wrote:
 
 On Fri, Feb 20, 2009 at 10:43 PM, Marc Sturlese
 marc.sturl...@gmail.comwrote:
 

 Hey,
 Yeah, I patched the bug reported by Ryuuichi of the SimpleDateFormat
 aswell.
 Is there any other known concurrency bug that maybe I am missing?
 In my use case I could manage to index not concurrently but would like to
 discover why this is happening...

 Thank you very much!


 I don't see any obvious issue except for these two fixes. Are you
 experiencing this problem even after applying both of Ryuuichi's fixes?
 
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 

-- 
View this message in context: 
http://www.nabble.com/concurrency-problem-with-delta-import-%28indexing-various-cores-simultaniously%29-tp22120430p22125443.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 11:23 PM, Marc Sturlese marc.sturl...@gmail.comwrote:


 Yes,
 Now it's almost tree days non-stop since I am running updates with the 3
 cores with cron jobs. If there are updates of 1 docs everything is
 alrite. When I start doing updates of 30 is when that core runs really
 slow. I have to abort the import in that core and keep updating with less
 rows each time.
 Another thing to point is that tomcat reaches the maximum memory I allow
 (2Gig) and never goes down (but at least it doesn't run out of memory). Is
 that normal? Shouldn't the memory go down a lot after an update is
 completed?


I guess you are being hit by garbage collection. Memory utilization should
go down once an import completes. Which GC are you using? There have been a
few recent threads on GC settings. Perhaps you can try out a few of those
settings. I don't know how big your documents/index are but if possible give
it more memory.
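For reference, a common starting point at the time was to give the container a larger heap, enable the concurrent collector, and turn on GC logging to see what the collector is doing. A hypothetical set of Tomcat JVM options (values are examples only, adjust to your hardware):

```shell
# Hypothetical Tomcat settings: bigger heap, CMS collector, GC logging
JAVA_OPTS="-Xms1g -Xmx3g -XX:+UseConcMarkSweepGC \
           -verbose:gc -XX:+PrintGCDetails -Xloggc:/var/log/tomcat/gc.log"
```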

-- 
Regards,
Shalin Shekhar Mangar.


Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Marc Sturlese

I am working with 3 indexes of 1 gig each. I am using the standard settings of
the GC, haven't changed anything, and am using Java version 1.6.0_07.
I don't know much about GC configuration... I just read this

http://marcus.net/blog/2007/11/10/solr-search-and-java-gc-tuning/

a month ago, when I experienced another problem with Solr (in the end it was
not the GC's fault). So, any advice about which GC I should try or what I
should tune?

Thank you very much!



Shalin Shekhar Mangar wrote:
 
 On Fri, Feb 20, 2009 at 11:23 PM, Marc Sturlese
 marc.sturl...@gmail.comwrote:
 

 Yes,
 Now it's almost tree days non-stop since I am running updates with the 3
 cores with cron jobs. If there are updates of 1 docs everything is
 alrite. When I start doing updates of 30 is when that core runs
 really
 slow. I have to abort the import in that core and keep updating with less
 rows each time.
 Another thing to point is that tomcat reaches the maximum memory I allow
 (2Gig) and never goes down (but at least it doesn't run out of memory).
 Is
 that normal? Shouldn't the memory go down a lot after an update is
 completed?

 
 I guess you are being hit by garbage collection. Memory utilization should
 go down once an import completes. Which GC are you using? There have been
 a
 few recent threads on GC settings. Perhaps you can try out a few of those
 settings. I don't know how big your documents/index are but if possible
 give
 it more memory.
 
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 

-- 
View this message in context: 
http://www.nabble.com/concurrency-problem-with-delta-import-%28indexing-various-cores-simultaniously%29-tp22120430p22125716.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Updating a single field of a document

2009-02-20 Thread Amit Nithian
Thanks Otis. Are these Solr-specific issues? In looking through Lucene's
FAQ, it seems that you have to delete the document and re-add it. Could a
possible solution be to find the document by its unique id and set the
fields that were changed, or would this not scale when doing a lot of
document field updates?
Which JIRA issues were you referring to?

Thanks
Amit

On Thu, Feb 19, 2009 at 6:57 PM, Otis Gospodnetic 
otis_gospodne...@yahoo.com wrote:


 Amit,

 This is still the case.  I believe 2 separate issues related to this exist
 in JIRA, but none is in a finished state.

 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



 - Original Message 
  From: Amit Nithian anith...@gmail.com
  To: solr-user@lucene.apache.org
  Sent: Friday, February 20, 2009 7:00:03 AM
  Subject: Updating a single field of a document
 
  Is there a way in Solr 1.2 (or Solr 1.3) to update a single field of an
  existing document if I know the primary key? Reason I ask is that I
  construct a document from multiple sources and some fields may need
 periodic
  updating from one of those sources. I would prefer not to have to
  reconstruct the entire document (and hence query the multiple sources)
 for a
  single field change.
  I noticed that Solr 1.2 will delete and add the new document rather than
  replace individual fields. Is there a way around this?
 
  Thanks
  Amit




Re: Updating a single field of a document

2009-02-20 Thread Shalin Shekhar Mangar
On Sat, Feb 21, 2009 at 1:00 AM, Amit Nithian anith...@gmail.com wrote:

 Thanks Otis. Are these Solr specific issues. In looking through Lucene's
 FAQ, it seems that you would have to delete the document and re-add. Could
 a
 possible solution be to find the document by the unique-id and set the
 fields that were changed or would this not scale when doing a lot of
 document field updates?
 Which JIRA issues were you referring to?


https://issues.apache.org/jira/browse/SOLR-139
https://issues.apache.org/jira/browse/SOLR-828

-- 
Regards,
Shalin Shekhar Mangar.


Re: Question about etag

2009-02-20 Thread Pascal Dimassimo

Sorry, the XML of the solrconfig.xml was lost. It is:

<httpCaching lastModifiedFrom="openTime" etagSeed="Solr">
</httpCaching>



Hi guys,
 
I'm having trouble understanding the behavior of firefox and the etag.
 
After cleaning the cache, I send this request from firefox:
 
GET /solr/select/?q=television HTTP/1.1
Host: localhost:8088
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6)
Gecko/2009011913 Firefox/3.0.6 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: JSESSIONID=AA71D602A701BB6287C60083DD6879CD
 
Which solr responds with:
 
HTTP/1.1 200 OK
Last-Modified: Thu, 19 Feb 2009 19:57:14 GMT
ETag: NmViOTJkMjc1ODgwMDAwMFNvbHI=
Content-Type: text/xml; charset=utf-8
Transfer-Encoding: chunked
Server: Jetty(6.1.3)
(#data following#)
 
So far so good. But then I press F5 to refresh the page. Now, if I
understand correctly the way the ETag works, Firefox should send the request
with an If-None-Match header along with the ETag, and then the server should
return a 304 Not Modified code.
 
But what happens is that Firefox just doesn't send anything. In the Firebug
window I only see 0 requests. Just to make sure, I tested with tcpmon and
nothing is sent by Firefox.
 
Is this making sense? Am I missing something?
 
My solrconfig.xml has this config:


 
 
Thanks!

-- 
View this message in context: 
http://www.nabble.com/Question-about-etag-tp22125449p22127322.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Defining shards in solrconfig with multiple cores

2009-02-20 Thread Yonik Seeley
On Fri, Feb 20, 2009 at 10:32 AM, jdleider nab...@justinleider.com wrote:
 However when i try to /select using
 this shards param in the solrconfig.xml the query just hangs.

The basic /select url should normally not have shards set as a
default... this will cause infinite recursion when the top level
searcher sends requests to the sub-searchers until you exhaust all
threads and run into a distributed deadlock.  Set up another handler
with the default shards param instead.
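A sketch of such a separate handler in solrconfig.xml (the handler name and shard list are examples, taken from the question above):

```xml
<requestHandler name="/distrib" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- distributed requests go to /distrib; the shards themselves are queried via the plain /select -->
    <str name="shards">solr1:8080/solr/core0,solr1:8080/solr/core1,solr1:8080/solr/core2,solr1:8080/solr/core3</str>
  </lst>
</requestHandler>
```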

-Yonik
Lucene/Solr? http://www.lucidimagination.com


mapping pdf metadata

2009-02-20 Thread Josh Joy
Hi,

I'm having trouble figuring out how to map the Tika metadata fields to my
own Solr schema document fields. I guess the first hurdle I need to
overcome is: where can I find a list of the Tika PDF metadata fields that
are available for mapping?

Thanks,
Josh


show first couple sentences from found doc

2009-02-20 Thread Josh Joy
Hi,

I would like to do something similar to Google, in that for my list of hits,
I would like to grab the surrounding text around my query term so I can
include that in my search results. What's the easiest way to do this?

Thanks,
Josh


Re: show first couple sentences from found doc

2009-02-20 Thread Koji Sekiguchi

Josh Joy wrote:

Hi,

I would like to do something similar to Google, in that for my list of hits,
I would like to grab the surrounding text around my query term so I can
include that in my search results. What's the easiest way to do this?

Thanks,
Josh

  


Highlighter?

http://wiki.apache.org/solr/HighlightingParameters
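For instance, Google-style snippets of text surrounding the query term can be requested with highlighting parameters along these lines (host and field names are assumptions):

```text
http://localhost:8983/solr/select?q=content:television&hl=true&hl.fl=content&hl.snippets=2&hl.fragsize=150
```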

Koji




Re: mapping pdf metadata

2009-02-20 Thread Otis Gospodnetic

Josh,

You didn't mention whether you are using 
http://wiki.apache.org/solr/ExtractingRequestHandler , but if you are not, 
maybe this already has what you need: 
http://wiki.apache.org/solr/ExtractingRequestHandler#head-c413be32c951c89c0a28f4f8336aa7d2774ec2d6

 
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Josh Joy joshjd...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Saturday, February 21, 2009 9:11:01 AM
 Subject: mapping pdf metadata
 
 Hi,
 
 I'm having trouble figuring out how to map the tika metadata fields to my
 own solr schema document fields. I guess the first hurdle I need to
 overcome, is where can I find a list of the Tika PDF metadata fields that
 are available for mapping?
 
 Thanks,
 Josh



Re: mapping pdf metadata

2009-02-20 Thread Erik Hatcher
And when you do use the ExtractingRequestHandler (aka Solr Cell), you  
can find the metadata fields by using the ext.extract.only=true setting.


You might also find this article by Sami Siren helpful: http://www.lucidimagination.com/index.php?option=com_content&task=view&id=106 



Erik


On Feb 20, 2009, at 8:39 PM, Otis Gospodnetic wrote:



Josh,

You didn't mention whether you are using http://wiki.apache.org/solr/ExtractingRequestHandler 
 , but if you are not, maybe this already has what you need: http://wiki.apache.org/solr/ExtractingRequestHandler#head-c413be32c951c89c0a28f4f8336aa7d2774ec2d6



Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: Josh Joy joshjd...@gmail.com
To: solr-user@lucene.apache.org
Sent: Saturday, February 21, 2009 9:11:01 AM
Subject: mapping pdf metadata

Hi,

I'm having trouble figuring out how to map the tika metadata fields  
to my

own solr schema document fields. I guess the first hurdle I need to
overcome, is where can I find a list of the Tika PDF metadata  
fields that

are available for mapping?

Thanks,
Josh




Suggested hardening of Solr schema.jsp admin interface

2009-02-20 Thread Peter Wolanin
My colleague Paul opened this issue and supplied a patch and I
commented on it regarding a potential security weakness in the admin
interface:

https://issues.apache.org/jira/browse/SOLR-1031


-- 
Peter M. Wolanin, Ph.D.
Momentum Specialist,  Acquia. Inc.
peter.wola...@acquia.com


What is the performance impact of a fq that matches all docs?

2009-02-20 Thread Peter Wolanin
We are working on integration with the Drupal CMS, and so are writing
code that carries out operations that might only be relevant for only
a small subset of the sites/indexes that might use the integration
module.  In this regard, I'm wondering if adding to the query (using
the dismax or mlt handlers) a fq that matches all documents would have
any impact on performance?  I gatehr that there is caching for the fq
matches, but it seems liek that would still incur some overhead,
especially for a large index?

As a more concrete example, suppose each document has a string field
that names the role of user that is allowed to see the content, e.g.
'public', 'registered', 'admin'.  Most sites have only public content,
but because our code is generic, we might add fq=role:public to
every query.  What would the expected performance effect be, compared
to omitting that fq, if, for example, we had a way to determine in
advance that all site content matches 'public'?

Thanks,

Peter

-- 
Peter M. Wolanin, Ph.D.
Momentum Specialist,  Acquia. Inc.
peter.wola...@acquia.com