[jira] Commented: (SOLR-1229) deletedPkQuery feature does not work when pk and uniqueKey field do not have the same value

2009-08-23 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746584#action_12746584
 ] 

Noble Paul commented on SOLR-1229:
--

bq.what is this broken functionality?

Until this fix, the user-provided 'pk' was always honoured. Now the derived pk 
can never be overridden. My latest patch fixes this and makes the corresponding 
changes to the tests. 

> deletedPkQuery feature does not work when pk and uniqueKey field do not have 
> the same value
> ---
>
> Key: SOLR-1229
> URL: https://issues.apache.org/jira/browse/SOLR-1229
> Project: Solr
>  Issue Type: Bug
>  Components: contrib - DataImportHandler
>Affects Versions: 1.4
>Reporter: Erik Hatcher
>Assignee: Erik Hatcher
> Fix For: 1.4
>
> Attachments: SOLR-1229.patch, SOLR-1229.patch, SOLR-1229.patch, 
> SOLR-1229.patch, SOLR-1229.patch, SOLR-1229.patch, tests.patch
>
>
> Problem doing a delta-import such that records marked as "deleted" in the 
> database are removed from Solr using deletedPkQuery.
> Here's a config I'm using against a mocked test database:
> {code:xml}
> 
>  
>  
>pk="board_id"
>transformer="TemplateTransformer"
>deletedPkQuery="select board_id from boards where deleted = 'Y'"
>query="select * from boards where deleted = 'N'"
>deltaImportQuery="select * from boards where deleted = 'N'"
>deltaQuery="select * from boards where deleted = 'N'"
>preImportDeleteQuery="datasource:board">
>  
>  
>  
>
>  
> 
> {code}
> Note that the uniqueKey in Solr is the "id" field.  And its value is a 
> template board-.
> I noticed that the javadoc comment in DocBuilder#collectDelta says "Note: In 
> our definition, unique key of Solr document is the primary key of the top 
> level entity".  This of course isn't really an appropriate assumption.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-1375) BloomFilter on a field

2009-08-23 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746591#action_12746591
 ] 

Andrzej Bialecki  commented on SOLR-1375:
-

See here for a Java impl. of FastBits: 
http://code.google.com/p/compressedbitset/ .

Re: BloomFilters - in BloomIndexComponent you seem to assume that when 
BloomKeySet.contains(key) returns true, the key exists in the set. This is not 
strictly true. Only a false result is certain: it tells you with probability 
1.0 that the key does NOT exist in the set. When the result is true, the answer 
is correct only with probability (1.0 - eps), i.e. the BloomFilter will return 
a false positive for a non-existent key with probability eps. 
You should take this into account when writing client code.
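
To make the asymmetry concrete, here is a minimal sketch. It is not the Hadoop 
BloomFilter that the patch uses; the class BloomSketch, its hashing scheme and 
the reallyContains helper are all hypothetical and exist only for illustration. 
A negative answer is trusted as-is, while a positive answer is treated as a 
hint and confirmed against an authoritative set.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

/** Illustrative Bloom filter only; NOT the Hadoop BloomFilter used by the patch. */
class BloomSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BloomSketch(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    // The mixing constant is an arbitrary choice for this sketch.
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = ((h1 * 0x9E3779B9) >>> 7) | 1;
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(String key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    /** false: the key is definitely absent. true: the key is only probably present. */
    boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    /** Client-side pattern: confirm a positive answer against the authoritative store. */
    static boolean reallyContains(BloomSketch filter, Set<String> authoritative, String key) {
        return filter.mightContain(key) && authoritative.contains(key);
    }

    public static void main(String[] args) {
        Set<String> index = new HashSet<>();
        // Deliberately tiny filter so that false positives are likely to show up.
        BloomSketch filter = new BloomSketch(64, 3);
        for (int i = 0; i < 20; i++) {
            String key = "doc-" + i;
            index.add(key);
            filter.add(key);
        }
        int falsePositives = 0;
        for (int i = 0; i < 1000; i++) {
            String absent = "other-" + i;
            if (filter.mightContain(absent) && !index.contains(absent)) {
                falsePositives++;
            }
        }
        System.out.println("false positives among 1000 absent keys: " + falsePositives);
    }
}
```

The filter never reports an added key as absent, so only the positive branch 
needs the second lookup.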

> BloomFilter on a field
> --
>
> Key: SOLR-1375
> URL: https://issues.apache.org/jira/browse/SOLR-1375
> Project: Solr
>  Issue Type: New Feature
>  Components: update
>Affects Versions: 1.4
>Reporter: Jason Rutherglen
>Priority: Minor
> Fix For: 1.5
>
> Attachments: SOLR-1375.patch, SOLR-1375.patch, SOLR-1375.patch
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> * A bloom filter is a read-only probabilistic set. It's useful
> for checking whether a key exists in a set, though it can return
> false positives. http://en.wikipedia.org/wiki/Bloom_filter 
> * The use case is indexing in Hadoop and checking for duplicates
> against a Solr cluster, which (when using the term dictionary or a
> query) is too slow and exceeds the time consumed for indexing.
> When a match is found, the host, segment, and term are returned.
> If the same term is found on multiple servers, multiple results
> are returned by the distributed process. (I just realized we'll
> also need to add the core name.) 
> * When new segments are created and commit is called, a new
> bloom filter is generated from a given field (default: id) by
> iterating over the term dictionary values. There's a bloom
> filter file per segment, which is managed on each Solr shard.
> When segments are merged away, their corresponding .blm files are
> also removed. In a future version we'll have a central server
> for the bloom filters so we're not abusing the thread pool of
> the Solr proxy and the networking of the Solr cluster (this will
> be done sooner rather than later, after testing this version). I
> held off because the central server requires syncing the Solr
> servers' files (which is like reverse replication). 
> * The patch uses the BloomFilter from Hadoop 0.20. I want to jar
> up only the necessary classes so we don't have a giant Hadoop
> jar in lib.
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/bloom/BloomFilter.html
> * Distributed code is added and seems to work; I extended
> TestDistributedSearch to test over multiple HTTP servers. I
> chose this approach rather than the manual method used by (for
> example) TermVectorComponent.testDistributed because I'm new to
> Solr's distributed search and wanted to learn how it works (the
> stages are confusing). Using this method, I didn't need to set up
> multiple Tomcat servers and manually execute tests.
> * More of the bloom filter options need to be configurable via
> solrconfig
> * I'll add more test cases




Re: multivalued integer testing

2009-08-23 Thread Yonik Seeley
On Sat, Aug 22, 2009 at 5:36 PM, smock wrote:
> Hello,
> I'm trying to write a test using a multivalued integer, for the
> StatsComponent.  I've defined a field with multivalued="true" and was able
> to successfully add a document with multiple integer values.  However, when
> testing if the fieldtype is multivalued:
> ft.isMultiValued()

A FieldType is shared with all fields using that type, and many of the
boolean flags such as "multiValued" act as defaults that specific
fields can override.

So try schemaField.isMultiValued()
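
A rough sketch of why the two calls can disagree. These are toy classes, not 
Solr's FieldType and SchemaField; every name below is hypothetical. It assumes 
only what is stated above: the type-level flag is a default, and an individual 
field may override it.

```java
/** Toy model only: these are NOT Solr's FieldType/SchemaField classes. */
class FieldFlagsSketch {

    /** Stands in for a field type shared by many fields; its flag is only a default. */
    static final class TypeDef {
        final String name;
        final boolean multiValuedDefault;

        TypeDef(String name, boolean multiValuedDefault) {
            this.name = name;
            this.multiValuedDefault = multiValuedDefault;
        }
    }

    /** Stands in for one declared field; a null override means "inherit from the type". */
    static final class FieldDef {
        final String name;
        final TypeDef type;
        final Boolean multiValuedOverride;

        FieldDef(String name, TypeDef type, Boolean multiValuedOverride) {
            this.name = name;
            this.type = type;
            this.multiValuedOverride = multiValuedOverride;
        }

        boolean isMultiValued() {
            // The field-level setting wins; otherwise fall back to the type default.
            return multiValuedOverride != null ? multiValuedOverride : type.multiValuedDefault;
        }
    }

    public static void main(String[] args) {
        TypeDef intType = new TypeDef("int", false);
        FieldDef price = new FieldDef("price", intType, null);
        FieldDef ratings = new FieldDef("ratings", intType, Boolean.TRUE);

        System.out.println("type default:     " + intType.multiValuedDefault);
        System.out.println("price (inherits): " + price.isMultiValued());
        System.out.println("ratings (field):  " + ratings.isMultiValued());
    }
}
```

Asking the shared type only ever reports the default, so a test that needs the 
effective setting has to ask the field.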

-Yonik
http://www.lucidimagination.com



> the testing environment returns false, which screws up the component I'm
> trying to test.  The component works fine in a production environment - I'm
> wondering what I'm doing wrong here?
>
> Thanks,
> -Harish
> --
> View this message in context: 
> http://www.nabble.com/multivalued-integer-testing-tp25097576p25097576.html
> Sent from the Solr - Dev mailing list archive at Nabble.com.
>
>


[jira] Created: (SOLR-1380) Extend StatsComponent to MultiValued Fields

2009-08-23 Thread Harish Agarwal (JIRA)
Extend StatsComponent to MultiValued Fields
---

 Key: SOLR-1380
 URL: https://issues.apache.org/jira/browse/SOLR-1380
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.4
Reporter: Harish Agarwal
Priority: Minor
 Fix For: 1.4


The StatsComponent does not work on multivalued fields.




[jira] Updated: (SOLR-1380) Extend StatsComponent to MultiValued Fields

2009-08-23 Thread Harish Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Agarwal updated SOLR-1380:
-

Attachment: SOLR-1380.patch

Patch to extend StatsComponent to multivalued fields.  Please review, suggest, 
criticize!

> Extend StatsComponent to MultiValued Fields
> ---
>
> Key: SOLR-1380
> URL: https://issues.apache.org/jira/browse/SOLR-1380
> Project: Solr
>  Issue Type: Improvement
>  Components: search
>Affects Versions: 1.4
>Reporter: Harish Agarwal
>Priority: Minor
> Fix For: 1.4
>
> Attachments: SOLR-1380.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> The StatsComponent does not work on multivalued fields.




Re: multivalued integer testing

2009-08-23 Thread smock

Thanks, that worked!



-- 
View this message in context: 
http://www.nabble.com/multivalued-integer-testing-tp25097576p25104396.html
Sent from the Solr - Dev mailing list archive at Nabble.com.



statscomponent extension to multivalued fields

2009-08-23 Thread smock

Hello,

I uploaded a patch that extends the StatsComponent to multivalued fields:
https://issues.apache.org/jira/browse/SOLR-1380

I am new to contributing and would appreciate any comments, criticisms, or
suggestions that can help get this incorporated into Solr.

-Harish
-- 
View this message in context: 
http://www.nabble.com/statscomponent-extension-to-multivalued-fields-tp25104416p25104416.html
Sent from the Solr - Dev mailing list archive at Nabble.com.



[jira] Commented: (SOLR-773) Incorporate Local Lucene/Solr

2009-08-23 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746682#action_12746682
 ] 

Chris Male commented on SOLR-773:
-

Brad,

Would you be willing to add your FunctionQuery to the issue?

> Incorporate Local Lucene/Solr
> -
>
> Key: SOLR-773
> URL: https://issues.apache.org/jira/browse/SOLR-773
> Project: Solr
>  Issue Type: New Feature
>Reporter: Grant Ingersoll
>Assignee: Grant Ingersoll
>Priority: Minor
> Fix For: 1.5
>
> Attachments: exampleSpatial.zip, lucene-spatial-2.9-dev.jar, 
> lucene.tar.gz, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, 
> SOLR-773-local-lucene.patch, SOLR-773-spatial_solr.patch, SOLR-773.patch, 
> SOLR-773.patch, spatial-solr.tar.gz
>
>
> Local Lucene has been donated to the Lucene project.  It has some Solr 
> components, but we should evaluate how best to incorporate it into Solr.
> See http://lucene.markmail.org/message/orzro22sqdj3wows?q=LocalLucene




[jira] Commented: (SOLR-1335) load core properties from a properties file

2009-08-23 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746746#action_12746746
 ] 

Noble Paul commented on SOLR-1335:
--

Please let me know if anyone wants to change anything about this feature before 
I commit it.

> load core properties from a properties file
> ---
>
> Key: SOLR-1335
> URL: https://issues.apache.org/jira/browse/SOLR-1335
> Project: Solr
>  Issue Type: New Feature
>Reporter: Noble Paul
>Assignee: Noble Paul
> Fix For: 1.4
>
> Attachments: SOLR-1335.patch, SOLR-1335.patch, SOLR-1335.patch, 
> SOLR-1335.patch
>
>
> There are a few ways of loading properties at runtime:
> # using a system property on the command line
> # if you use multicore, dropping them in solr.xml
> Otherwise, the only way is to keep a separate solrconfig.xml for each instance.
> #1 is error prone if the user fails to start with the correct system 
> property. 
> In our case we have four different configurations for the same deployment, 
> and we have to disable replication of solrconfig.xml. 
> It would be nice if I could distribute four properties files so that our ops 
> can drop in the right one and start Solr. It is also feasible for operations 
> to edit a properties file, but it is risky for someone to edit solrconfig.xml 
> if they do not understand Solr.
> I propose a properties file in the instancedir named solrcore.properties. If 
> present, it would be loaded and its entries added as core-specific properties.
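
The proposal could be sketched roughly as follows. This is not the code in the 
attached patch: the class CorePropertiesSketch and the method 
loadCoreProperties are hypothetical, and the only assumption taken from the 
description is that a file named solrcore.properties may or may not exist in 
the instance directory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Illustrative only; not the implementation from the SOLR-1335 patch. */
class CorePropertiesSketch {
    static final String FILE_NAME = "solrcore.properties";

    /**
     * Returns the properties found in instanceDir/solrcore.properties,
     * or an empty Properties object when the file is absent.
     */
    static Properties loadCoreProperties(Path instanceDir) throws IOException {
        Properties props = new Properties();
        Path file = instanceDir.resolve(FILE_NAME);
        if (Files.isRegularFile(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: pass the instance directory as the first argument.
        Path instanceDir = Paths.get(args.length > 0 ? args[0] : ".");
        Properties props = loadCoreProperties(instanceDir);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With this shape, ops only swap one small file per deployment and never touch 
solrconfig.xml.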
