date:20210210

Re: support Rich Document

2021-02-10 Thread Jörn Franke

You can store them on the filesystem and a link to them in Solr. Your search 
application could fetch them from the filesystem and serve them to the users. 

Alternatively, serve them via WebDAV, SharePoint, or whatever your organization 
sets as standard.

It does not make sense to store them in Solr - they would just blow up the 
index without any value.
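
A minimal sketch of the approach described above, assuming the text has already been extracted (for example by Tika) and that the originals are reachable under some base URL. The field names (`file_name`, `content`, `original_url`), the helper `build_solr_doc`, and the example URL are hypothetical and would need to match your own schema and file server.

```python
import hashlib
from pathlib import Path
from urllib.parse import quote


def build_solr_doc(file_path, extracted_text, base_url):
    """Build a Solr document that holds the searchable text and only a
    link to the original file, not the file itself."""
    path = Path(file_path)
    relative = path.as_posix().lstrip("/")
    return {
        # Stable id derived from the path (an assumption; any unique key works).
        "id": hashlib.sha1(relative.encode("utf-8")).hexdigest(),
        "file_name": path.name,
        "content": extracted_text,
        # The search application follows this link to serve the original.
        "original_url": base_url.rstrip("/") + "/" + quote(relative),
    }


doc = build_solr_doc(
    "reports/2021/q1.xlsx",
    "Revenue by region ...",
    "https://files.example.org/docs/",
)
print(doc["original_url"])
```

The search application would show `content` matches in the result list and use `original_url` to open the untouched original.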

> On 11.02.2021 at 05:08, Luke wrote:
> 
> Hi,
> 
> I know Solr can index rich documents, but I have one requirement.
> 
> I have all kinds of documents, such as Word, PDF, Excel, PPT, JPG, etc.
> 
> When Solr indexes them with Tika or OCR, it extracts the text and saves it
> to Solr, but the formatting is lost, so when the user opens the document, it
> is not readable.
> 
> My question is whether Solr can keep the original documents somewhere, such
> as in an external field, so that when I load documents, the original
> document can be retrieved too.
> 
> Thanks

Down Replica is elected as Leader (solr v8.7.0)

2021-02-10 Thread mmb1234

Hello,

On reboot of one of the Solr nodes in the cluster, we often see a
collection's shards with
1. the LEADER replica in DOWN state, and/or
2. a shard with no LEADER

Output from /solr/admin/collections?action=CLUSTERSTATUS is below.

Even after 5 to 10 minutes, the collection often does not recover. It is
unclear why this is happening and what we can try to prevent or remedy it.

PS: perReplicaState=true in Solr v8.8.0 didn't work well because after a
rebalance all replicas somehow get a "leader:true" status even though
states.json looked ok.

{
  "responseHeader": {
    "status": 0,
    "QTime": 2
  },
  "cluster": {
    "collections": {
      "datacore": {
        "pullReplicas": "0",
        "replicationFactor": "0",
        "shards": {
          "__": {
            "range": null,
            "state": "active",
            "replicas": {
              "core_node1": {
                "core": "datacore____replica_t187",
                "base_url": "http://solr-0.solr-headless:8983/solr",
                "node_name": "solr-0.solr-headless:8983_solr",
                "state": "down",
                "type": "TLOG",
                "force_set_state": "false",
                "property.preferredleader": "true",
                "leader": "true"
              },
              "core_node2": {
                "core": "datacore____replica_t188",
                "base_url": "http://solr-1.solr-headless:8983/solr",
                "node_name": "solr-1.solr-headless:8983_solr",
                "state": "active",
                "type": "TLOG",
                "force_set_state": "false"
              },
              "core_node3": {
                "core": "datacore____replica_t189",
                "base_url": "http://solr-2.solr-headless:8983/solr",
                "node_name": "solr-2.solr-headless:8983_solr",
                "state": "active",
                "type": "TLOG",
                "force_set_state": "false"
              }
            }
          },
          "__j": {
            "range": null,
            "state": "active",
            "replicas": {
              "core_node19": {
                "core": "datacore___j_replica_t187",
                "base_url": "http://solr-0.solr-headless:8983/solr",
                "node_name": "solr-0.solr-headless:8983_solr",
                "state": "down",
                "type": "TLOG",
                "force_set_state": "false",
                "property.preferredleader": "true"
              },
              "core_node20": {
                "core": "datacore___j_replica_t188",
                "base_url": "http://solr-1.solr-headless:8983/solr",
                "node_name": "solr-1.solr-headless:8983_solr",
                "state": "active",
                "type": "TLOG",
                "force_set_state": "false"
              },
              "core_node21": {
                "core": "datacore___j_replica_t189",
                "base_url": "http://solr-2.solr-headless:8983/solr",
                "node_name": "solr-2.solr-headless:8983_solr",
                "state": "active",
                "type": "TLOG",
                "force_set_state": "false"
              }
            }
          },
          "__": {
            "range": null,
            "state": "active",
            "replicas": {
              "core_node4": {
                "core": "datacore____replica_t91",
                "base_url": "http://solr-0...



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
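
To spot the two symptoms listed in the message above (a leader replica that is down, or a shard with no leader), a CLUSTERSTATUS response could be scanned with something like the following sketch. It assumes the response has already been fetched and parsed into a dict with the same layout as the output quoted above; the function name `find_leader_problems` and the trimmed sample data are illustrative only, and the sketch detects the condition without explaining or fixing it.

```python
def find_leader_problems(cluster_status):
    """Return (collection, shard, description) tuples for shards whose
    leader is not active or that have no leader at all."""
    problems = []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            replicas = shard.get("replicas", {})
            # CLUSTERSTATUS reports the leader flag as the string "true".
            leaders = [
                (name, replica)
                for name, replica in replicas.items()
                if replica.get("leader") == "true"
            ]
            if not leaders:
                problems.append((coll_name, shard_name, "no leader"))
            for name, replica in leaders:
                state = replica.get("state")
                if state != "active":
                    problems.append(
                        (coll_name, shard_name, f"leader {name} is {state}")
                    )
    return problems


# Trimmed-down sample mirroring the first two shards of the output above.
sample = {
    "cluster": {
        "collections": {
            "datacore": {
                "shards": {
                    "__": {
                        "replicas": {
                            "core_node1": {"state": "down", "leader": "true"},
                            "core_node2": {"state": "active"},
                        }
                    },
                    "__j": {
                        "replicas": {
                            "core_node19": {"state": "down"},
                            "core_node20": {"state": "active"},
                        }
                    },
                }
            }
        }
    }
}

problems = find_leader_problems(sample)
for problem in problems:
    print(problem)
```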

support Rich Document

2021-02-10 Thread Luke

Hi,

I know Solr can index rich documents, but I have one requirement.

I have all kinds of documents, such as Word, PDF, Excel, PPT, JPG, etc.

When Solr indexes them with Tika or OCR, it extracts the text and saves it
to Solr, but the formatting is lost, so when the user opens the document, it
is not readable.

My question is whether Solr can keep the original documents somewhere, such
as in an external field, so that when I load documents, the original
document can be retrieved too.

Thanks

Re: UPDATE collection's Rule-based Replica Placement

2021-02-10 Thread Ilan Ginzburg

Are you looking for something that would move existing collection replicas
to comply with a new set of rules?
I'm afraid that doesn't exist, but you can use the Collection API to
move replicas "manually".

Ilan
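
As a rough illustration of moving a replica "manually", the Collections API offers a MOVEREPLICA action. The sketch below only builds the request URL; the host, collection, replica, and node names are placeholders, and the helper `move_replica_url` is hypothetical. Check the parameters against the reference guide for your Solr version before sending anything.

```python
from urllib.parse import urlencode


def move_replica_url(solr_base, collection, replica, target_node):
    """Build a Collections API URL that asks Solr to move one replica
    to another node. Nothing is sent; the caller decides when to call it."""
    params = {
        "action": "MOVEREPLICA",
        "collection": collection,
        "replica": replica,         # core node name, e.g. core_node6
        "targetNode": target_node,  # node name as shown in CLUSTERSTATUS
    }
    return f"{solr_base.rstrip('/')}/admin/collections?{urlencode(params)}"


url = move_replica_url(
    "http://localhost:8983/solr",
    "mycollection",
    "core_node6",
    "host2:8983_solr",
)
print(url)
```

Each replica that violates the new rules would need its own call, so this is only practical for a modest number of replicas.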

On Tue, Feb 9, 2021 at 1:10 PM mosheB  wrote:
>
> Hi community,
> Using Solr 8.3, is there any way to change the replica placement of a
> "running" collection, say "from this point forward", or should I recreate
> the collection and migrate all my data from the existing collection to the
> new one?
> I tried the COLLECTIONPROP action, which doesn't do the job: it just
> updates the collectionprops.json file and does not actually affect the
> replica placement enforcement.
>
> Thanks!
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Collection Creation across DC

2021-02-10 Thread Revas

Hello,

Can we create a collection across data centers (with a shard replica in a
different data center) for HA?

Thanks
Revas

Index rich document and view

2021-02-10 Thread Luke Oak

Hi,

I have all kinds of rich documents, such as Excel, PPT, PDF, Word, JPG, etc. I 
know Tika or OCR can convert them to text and index it. But when I open the 
document, the formatting has changed. How can I keep the original document 
format? Is that possible in Solr?

If not, can I use an external field type to save the original file and load it 
when I want to view the document?

Thanks

Without custom updateRequestProcessorChain: DistributedUpdateProcessorFactory was explicitly disabled from this updateRequestProcessorChain

2021-02-10 Thread diego_70

Hello,

We are using SolrCloud 8.5.

Several times per hour we see this kind of error in the logs:

/RunUpdateProcessor has received an AddUpdateCommand containing a document
that appears to still contain Atomic document update operations, most likely
because DistributedUpdateProcessorFactory was explicitly disabled from this
updateRequestProcessorChain./

As far as I understand, this error is related to a customized
updateRequestProcessorChain. But our main concern is that we are not using
that feature: we have not defined any new updateRequestProcessorChain. We
use the default default-update-request-processor-chain.

I'm not able to reproduce the issue in a test environment; the same update
works fine there.

A typical failing update contains several atomic updates following the
structure "field": {"set": "value"}, "field1": {"set": "value1"}, etc.
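
For readers unfamiliar with that structure, here is a small sketch of such an atomic update payload, together with a helper that lists which fields of a document carry atomic operations. The document id, the field names, and the helper `atomic_fields` are illustrative assumptions; the sketch shows the shape of the request only and does not diagnose the error.

```python
import json

# Modifiers accepted by Solr atomic updates.
ATOMIC_OPS = {"set", "add", "add-distinct", "remove", "removeregex", "inc"}


def atomic_fields(doc):
    """Return the sorted names of fields whose value is a modifier map
    such as {"set": "value"}."""
    return sorted(
        name
        for name, value in doc.items()
        if isinstance(value, dict) and value and set(value) <= ATOMIC_OPS
    )


# One document with two atomic "set" operations, as in the description above.
update = [
    {
        "id": "doc-1",
        "field": {"set": "value"},
        "field1": {"set": "value1"},
    }
]

payload = json.dumps(update)  # body that would be sent to the update handler
print(atomic_fields(update[0]))
```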

Do you have any idea what the root cause could be? Maybe performance
issues, too much load, or a problem with the tlog?

Thanks in advance


Diego
--
Senior Software Engineer
Telefónica Cybersecurity & Cloud Tech



