Re: Solr 4.x auto-increment/sequence/counter functionality.

2013-03-10 Thread mark12345
A slightly different approach.

* I noticed that I can sort by the internal Lucene _docid_.

  - http://wiki.apache.org/solr/CommonQueryParameters

    "You can sort by index id using sort=_docid_ asc or sort=_docid_ desc"

* I have also read the docid is represented by a sequential number.

  - http://lucene.472066.n3.nabble.com/Get-DocID-after-Document-insert-td556278.html

    "Your document IDs may change, and in fact *will* change if you delete a
    document and then optimize. Say you index 100 docs, delete number 50 and
    optimize. Documents that originally had IDs 51-100 will now have IDs 50-99
    and your hierarchy will be messed up."


So there is a slight chance that the _docid_ might represent document
creation order.  Does anyone have knowledge and experience with the
internals of the Lucene _docid_ field?
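
To make the quoted warning concrete, here is a toy Java model of the renumbering. It is not Lucene code: the class and method names are invented for the illustration, and IDs are 1-based only so the numbers match the quote (real Lucene docids start at 0).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: a document's ID is its 1-based position in the list. */
final class DocIdShiftDemo {

    /** Builds an "index" holding doc-1 .. doc-n in insertion order. */
    static List<String> index(int n) {
        List<String> docs = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            docs.add("doc-" + i);
        }
        return docs;
    }

    /** Delete plus "optimize": the list is compacted, so later IDs shift down. */
    static void deleteAndOptimize(List<String> docs, int docId) {
        docs.remove(docId - 1);
    }

    /** Returns the current 1-based ID of a document, or -1 if it is gone. */
    static int docIdOf(List<String> docs, String name) {
        int pos = docs.indexOf(name);
        return pos < 0 ? -1 : pos + 1;
    }

    public static void main(String[] args) {
        List<String> docs = index(100);
        deleteAndOptimize(docs, 50);
        System.out.println(docIdOf(docs, "doc-51"));  // 50
        System.out.println(docIdOf(docs, "doc-100")); // 99
    }
}
```

The model only shows that the stored value is not stable. It says nothing about whether real segment merges keep documents in insertion order.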



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-x-auto-increment-sequence-counter-functionality-tp4045125p4046137.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 4.x auto-increment/sequence/counter functionality.

2013-03-08 Thread mark12345
So I think I took the easiest option by creating an UpdateRequestProcessor
implementation (I was unsure of the performance implications and object
model of ScriptUpdateProcessor).  The
DocumentCreationDetailsProcessorFactory class below seems to achieve my aim
of letting me sort my Solr documents by creation order (to an extent; I
don't think it is exactly the commit order), though the
auto-increment/sequence/counter values are not continuous.

Solr Sort Parameter String:
sort=created_time_stamp_l asc, created_processing_sequence_number_l asc,
created_by_solr_thread_id_l asc, created_by_solr_core_name_s asc,
created_by_solr_shard_id_s asc
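
For readers unsure how a multi-field sort resolves ties, the same precedence can be sketched in plain Java. CreationKey is a hypothetical holder class, not part of Solr, and the sketch assumes every field is present (non-null):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical holder for the five sort fields; not a Solr class. */
final class CreationKey {
    final long timeStamp;       // created_time_stamp_l
    final long sequenceNumber;  // created_processing_sequence_number_l
    final long threadId;        // created_by_solr_thread_id_l
    final String coreName;      // created_by_solr_core_name_s
    final String shardId;       // created_by_solr_shard_id_s

    CreationKey(long timeStamp, long sequenceNumber, long threadId,
                String coreName, String shardId) {
        this.timeStamp = timeStamp;
        this.sequenceNumber = sequenceNumber;
        this.threadId = threadId;
        this.coreName = coreName;
        this.shardId = shardId;
    }

    /** Same precedence as the sort string: each later key only breaks ties. */
    static final Comparator<CreationKey> CREATION_ORDER =
            Comparator.<CreationKey>comparingLong(k -> k.timeStamp)
                    .thenComparingLong(k -> k.sequenceNumber)
                    .thenComparingLong(k -> k.threadId)
                    .thenComparing(k -> k.coreName)
                    .thenComparing(k -> k.shardId);

    /** Returns a sorted copy and leaves the input list untouched. */
    static List<CreationKey> sorted(List<CreationKey> keys) {
        List<CreationKey> copy = new ArrayList<>(keys);
        copy.sort(CREATION_ORDER);
        return copy;
    }
}
```

Two documents stamped in the same millisecond fall through to the sequence number, which is why the timestamp alone is not enough.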


Any comments or feedback would be appreciated.

//
// UpdateRequestProcessor implementation
//
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.solr.cloud.CloudDescriptor;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreDescriptor;
import org.apache.solr.core.SolrCore;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;
import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

public class DocumentCreationDetailsProcessorFactory extends UpdateRequestProcessorFactory {

    // Shared by every processor this factory creates; starts again at zero
    // whenever the JVM restarts, so it is only a tie-breaker for the timestamp.
    private static final AtomicLong processingSequenceNumber = new AtomicLong();

    @Override
    public UpdateRequestProcessor getInstance(SolrQueryRequest req,
            SolrQueryResponse rsp, UpdateRequestProcessor next) {
        return new DocumentCreationDetailsProcessor(req, rsp, next,
                processingSequenceNumber);
    }
}

class DocumentCreationDetailsProcessor extends UpdateRequestProcessor {

    private final SolrQueryRequest req;

    @SuppressWarnings("unused")
    private final SolrQueryResponse rsp;

    @SuppressWarnings("unused")
    private final UpdateRequestProcessor next;

    private final AtomicLong processingSequenceNumber;

    public DocumentCreationDetailsProcessor(SolrQueryRequest req,
            SolrQueryResponse rsp, UpdateRequestProcessor next,
            AtomicLong processingSequenceNumber) {
        super(next);

        this.req = req;
        this.rsp = rsp;
        this.next = next;

        this.processingSequenceNumber = processingSequenceNumber;
    }

    @Override
    public void processAdd(AddUpdateCommand cmd) throws IOException {

        SolrInputDocument solrInputDocument = cmd.getSolrInputDocument();

        // Wall-clock time plus a per-JVM counter to order documents
        // that arrive within the same millisecond.
        solrInputDocument.addField("created_time_stamp_l",
                System.currentTimeMillis());
        solrInputDocument.addField("created_processing_sequence_number_l",
                processingSequenceNumber.incrementAndGet());

        String solrCoreName = null;
        String solrShardId = null;

        if (req != null
                && req.getCore() != null
                && req.getCore().getCoreDescriptor() != null) {

            SolrCore solrCore = req.getCore();
            CoreDescriptor coreDesc = null;
            CloudDescriptor cloudDesc = null;

            if (solrCore != null) {
                solrCoreName = solrCore.getName();
                coreDesc = req.getCore().getCoreDescriptor();

                if (coreDesc != null) {
                    // Stays null when not running in SolrCloud mode.
                    cloudDesc = coreDesc.getCloudDescriptor();
                }

                if (cloudDesc != null) {
                    solrShardId = cloudDesc.getShardId();
                }
            }
        }

        solrInputDocument.addField("created_by_solr_thread_id_l",
                Thread.currentThread().getId());
        solrInputDocument.addField("created_by_solr_core_name_s",
                solrCoreName);
        solrInputDocument.addField("created_by_solr_shard_id_s",
                solrShardId);

        // pass it up the chain
        super.processAdd(cmd);
    }
}
//



//
//  Added the below for a bit of context (http://wiki.apache.org/solr/SolrPlugins)
//

mkdir /opt/solr/instances/test/collection1/lib
cp /home/user/download/test-solr-plugins-0.0.1.jar /opt/solr/instances/test/collection1/lib/
chown root:tomcat7 /opt/solr/instances/test/collection1/lib/*

vim /opt/solr/instances/test/collection1/conf/solrconfig.xml
<updateRequestProcessorChain name="mychain">
  <processor class="com.test.solr.plugins.DocumentCreationDetailsProcessorFactory">
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>


vim /opt/solr/instances/test/collection1/conf/solrconfig.xml
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">mychain</str>
  </lst>
</requestHandler>




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-x-auto-increment-sequence-counter-functionality-tp4045125p4045725.html


Re: Solr 4.x auto-increment/sequence/counter functionality.

2013-03-06 Thread Otis Gospodnetic
Hi,

How about a custom UpdateRequestProcessor that uses milliseconds or even
nanoseconds and stores them in some field?  If that is enough resolution
and you still want to avoid collisions, append a random letter/string/number
to it, a la <millis or nanos>_<extra stuff>, to make it unique.
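
A minimal sketch of that key format in plain Java, outside any Solr class. The TimestampKeys name, the underscore separator and the 8-hex-digit suffix are assumptions made for the illustration:

```java
import java.security.SecureRandom;
import java.util.Locale;

/** Hypothetical helper: builds "<millis>_<random suffix>" style keys. */
final class TimestampKeys {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** The suffix is 32 random bits rendered as 8 lowercase hex digits. */
    static String next(long millis) {
        return millis + "_" + String.format(Locale.ROOT, "%08x", RANDOM.nextInt());
    }

    public static void main(String[] args) {
        // Two keys for the same millisecond differ only in the suffix.
        long now = System.currentTimeMillis();
        System.out.println(next(now));
        System.out.println(next(now));
    }
}
```

The suffix makes a collision unlikely rather than impossible, and keys that share a millisecond sort in suffix order, not arrival order.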

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Wed, Mar 6, 2013 at 2:31 AM, marks1900-pos...@yahoo.com.au
<marks1900-pos...@yahoo.com.au> wrote:

> I am looking into how to add auto-increment/sequence/counter functionality
> to Solr 4.x. I specifically want to do this so that I have a numeric field
> which records the document insertion order and can be sorted against. This
> numeric field would have to be unique and not be allowed to change over
> time.  Unfortunately, using an insertion date would
> produce numerous collisions.  Any feedback or ideas on an approach that
> would help me achieve this would be appreciated.
>
> I am thinking that this could be achieved multiple ways:
> * Via remote Solr document calls.  (A Solr singleton for remote calls +
>   Solr calls to get the current sequence value and then a call to increment
>   the value.)
> * A Solr plugin (extend RequestHandlerBase -
>   http:///sequence?q=name&size=1000 and
>   return the next sequence/counter number).
> * Using a standard RDBMS such as PostgreSQL.
> * Some special Solr/Lucene functionality that I don't know about.
>
> The closest information I could find is outlined here:
> http://lucene.472066.n3.nabble.com/counter-field-td3886549.html
>
>
> A bit more background:
>
> I am using Solr as a NoSQL solution with great text search capabilities.
> Currently, I am inserting beans using SolrJ, and each of these beans has an
> id made up of a bean type string (such as CUSTOMER, BOOK or STORE)
> concatenated with an identifier that is unique within that bean type
> (Customer - UUID.randomUUID().toString().toLowerCase(Locale.ENGLISH),
> Book - ISBN, Store - name).  For instance,
> CUSTOMER-b245659b-825c-4357-aab0-6d592468889a, BOOK-978-1782161325 or
> STORE-TheUniquelyNamedStore.  Ideally I am aiming to add a numeric field
> to these beans that represents insertion position and will then be used
> as a sorting field.



Re: Solr 4.x auto-increment/sequence/counter functionality.

2013-03-06 Thread mark12345
Appending a random value only reduces the chance of a collision (and I need
to ensure uniqueness at all times), and it could hurt how the field is later
sorted.  I have not written a custom UpdateRequestProcessor before; is there
a way to incorporate a singleton that ensures one instance across a cluster
(SolrCloud)?

I guess the main thing is that I want the value to also be kept unique
across a cluster of Solr instances.  As far as I know, the only *free*
uniqueness check in Solr is the <uniqueKey>id</uniqueKey> declaration
in schema.xml.  Are there other options that I should be considering?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-x-auto-increment-sequence-counter-functionality-tp4045125p4045239.html


Re: Solr 4.x auto-increment/sequence/counter functionality.

2013-03-06 Thread Timothy Potter
This sounds like a job for Zookeeper (distributed coordination is what it does).

Take a look at:
http://zookeeper-user.578899.n2.nabble.com/Sequence-Number-Generation-With-Zookeeper-td5378618.html
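
For context on how this usually works: a znode created with CreateMode.PERSISTENT_SEQUENTIAL gets a monotonically increasing, zero-padded 10-digit counter appended to its path, so the created path itself carries a cluster-wide sequence number. The sketch below is self-contained and does not talk to a real ensemble. SequentialCreator and InMemoryCreator are hypothetical stand-ins for a ZooKeeper client, and the /solr-seq/n- path is made up; only the suffix parsing would carry over as-is.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a client call that creates a sequential znode. */
interface SequentialCreator {
    /** Returns the created path: the prefix plus a 10-digit counter. */
    String createSequential(String pathPrefix);
}

/** In-memory fake that mimics the naming scheme; not a ZooKeeper client. */
final class InMemoryCreator implements SequentialCreator {
    private final AtomicInteger counter = new AtomicInteger();

    @Override
    public String createSequential(String pathPrefix) {
        return pathPrefix
                + String.format(Locale.ROOT, "%010d", counter.getAndIncrement());
    }
}

final class SequenceNumbers {
    /** Extracts the numeric suffix from a created path. */
    static long fromPath(String createdPath) {
        String digits = createdPath.substring(createdPath.length() - 10);
        return Long.parseLong(digits);
    }

    /** Each call returns a value no other caller of the same creator receives. */
    static long next(SequentialCreator creator) {
        return fromPath(creator.createSequential("/solr-seq/n-"));
    }
}
```

With a real client each number costs a network round trip to the ensemble, which may matter at high indexing rates.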



Re: Solr 4.x auto-increment/sequence/counter functionality.

2013-03-06 Thread Upayavira
If you want to mess with UpdateRequestProcessors, try the
ScriptUpdateProcessor, with which you can write your update logic in
JavaScript. That would allow you to add your unique field. Use something
like timestamp+threadno+shardno and you'd have something unique
(assuming you can access those from JavaScript).
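
The composite value could look something like the following, shown in Java rather than JavaScript to match the rest of the thread. The CompositeKey name, the separators and the padding widths (13 digits for milliseconds, 5 for the thread number) are assumptions chosen so that string order follows numeric order:

```java
import java.util.Locale;

/** Hypothetical helper: joins timestamp, thread number and shard id. */
final class CompositeKey {

    /** Zero-pads the numeric parts so that string order matches numeric order. */
    static String of(long timestampMillis, long threadId, String shardId) {
        return String.format(Locale.ROOT, "%013d-%05d-%s",
                timestampMillis, threadId, shardId);
    }

    public static void main(String[] args) {
        System.out.println(
                of(System.currentTimeMillis(), Thread.currentThread().getId(), "shard1"));
    }
}
```

One caveat: a single thread can handle more than one document within the same millisecond, so the value is only unique if something else (a counter, or a finer clock) separates those documents.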

Upayavira
