Re: Deleting Fields

2015-06-01 Thread Charlie Hull

On 30/05/2015 00:30, Shawn Heisey wrote:

On 5/29/2015 5:08 PM, Joseph Obernberger wrote:

Hi All - I have a lot of fields to delete, but noticed that once I
started deleting them, I quickly ran out of heap space.  Is
delete-field a memory intensive operation?  Should I delete one field,
wait a while, then delete the next?


I'm not aware of a way to delete a field.  I may have a different
definition of what a field is than you do, though.

Solr lets you delete entire documents, but deleting a field from the
entire index would involve re-indexing every document in the index,
excluding that field.

Can you be more specific about exactly what you are doing, what you are
seeing, and what you want to see instead?

Also, please be aware of this:

http://people.apache.org/~hossman/#threadhijack

Thanks,
Shawn


Here's a rather old post on how we did something similar:
http://www.flax.co.uk/blog/2011/06/24/how-to-remove-a-stored-field-in-lucene/

Cheers

Charlie

--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk


Re: Deleting Fields

2015-06-01 Thread Joseph Obernberger
(ScopedHandler.java:137)

 at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
 at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
 at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
 at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
 at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
 at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
 at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
 at org.eclipse.jetty.server.Server.handle(Server.java:368)
 at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
 at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
 at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
 at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
 at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
 at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
 at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
 at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
 at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
 at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space



-Joe


On 5/30/2015 12:32 AM, Erick Erickson wrote:

Yes, but deleting fields from the schema only means that _future_
documents will throw an undefined field error. All the documents
currently in the index will retain that field.

Why you're hitting an OOM is a mystery though. But delete field isn't
removing the contents of indexed documents. Showing us the full stack
when you hit an OOM would be helpful.

Best,
Erick

On Fri, May 29, 2015 at 4:58 PM, Joseph Obernberger
j...@lovehorsepower.com wrote:

Thank you Shawn - I'm referring to fields in the schema.  With Solr 5,
you can delete fields from the schema.



https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-DeleteaField

-Joe


On 5/29/2015 7:30 PM, Shawn Heisey wrote:

On 5/29/2015 5:08 PM, Joseph Obernberger wrote:

Hi All - I have a lot of fields to delete, but noticed that once I
started deleting them, I quickly ran out of heap space.  Is
delete-field a memory intensive operation?  Should I delete one field,
wait a while, then delete the next?

I'm not aware of a way to delete a field.  I may have a different
definition of what a field is than you do, though.

Solr lets you delete entire documents, but deleting a field from the
entire index would involve re-indexing every document in the index,
excluding that field.

Can you be more specific about exactly what you are doing, what you are
seeing, and what you want to see instead?

Also, please be aware of this:

http://people.apache.org/~hossman/#threadhijack

Thanks,
Shawn






Re: Deleting Fields

2015-05-31 Thread Tomasz Borek
)
  at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
  at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
  at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
  at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
  at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
  at java.lang.Thread.run(Thread.java:745)
  Caused by: java.lang.OutOfMemoryError: Java heap space
 
 
 
  -Joe
 
 
  On 5/30/2015 12:32 AM, Erick Erickson wrote:
 
  Yes, but deleting fields from the schema only means that _future_
  documents will throw an undefined field error. All the documents
  currently in the index will retain that field.
 
  Why you're hitting an OOM is a mystery though. But delete field isn't
  removing the contents of indexed documents. Showing us the full stack
  when you hit an OOM would be helpful.
 
  Best,
  Erick
 
  On Fri, May 29, 2015 at 4:58 PM, Joseph Obernberger
  j...@lovehorsepower.com wrote:
 
  Thank you Shawn - I'm referring to fields in the schema.  With Solr 5,
  you can delete fields from the schema.
 
 
 https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-DeleteaField
 
  -Joe
 
 
  On 5/29/2015 7:30 PM, Shawn Heisey wrote:
 
  On 5/29/2015 5:08 PM, Joseph Obernberger wrote:
 
  Hi All - I have a lot of fields to delete, but noticed that once I
  started deleting them, I quickly ran out of heap space.  Is
  delete-field a memory intensive operation?  Should I delete one field,
  wait a while, then delete the next?
 
  I'm not aware of a way to delete a field.  I may have a different
  definition of what a field is than you do, though.
 
  Solr lets you delete entire documents, but deleting a field from the
  entire index would involve re-indexing every document in the index,
  excluding that field.
 
  Can you be more specific about exactly what you are doing, what you are
  seeing, and what you want to see instead?
 
  Also, please be aware of this:
 
  http://people.apache.org/~hossman/#threadhijack
 
  Thanks,
  Shawn
 
 
 



Re: Deleting Fields

2015-05-30 Thread Erick Erickson
)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
 at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
 ... 26 more

 Then later:

 ERROR - 2015-05-29 21:57:22.370; [UNCLASS shard9 core_node14 UNCLASS]
 org.apache.solr.common.SolrException; null:java.lang.RuntimeException:
 java.lang.OutOfMemoryError: Java heap space
 at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:854)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:463)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
 at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
 at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
 at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
 at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
 at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
 at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
 at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
 at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
 at org.eclipse.jetty.server.Server.handle(Server.java:368)
 at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
 at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
 at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
 at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
 at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
 at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
 at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
 at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
 at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
 at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.OutOfMemoryError: Java heap space



 -Joe


 On 5/30/2015 12:32 AM, Erick Erickson wrote:

 Yes, but deleting fields from the schema only means that _future_
 documents will throw an undefined field error. All the documents
 currently in the index will retain that field.

 Why you're hitting an OOM is a mystery though. But delete field isn't
 removing the contents of indexed documents. Showing us the full stack
 when you hit an OOM would be helpful.

 Best,
 Erick

 On Fri, May 29, 2015 at 4:58 PM, Joseph Obernberger
 j...@lovehorsepower.com wrote:

 Thank you Shawn - I'm referring to fields in the schema.  With Solr 5,
 you can delete fields from the schema.

 https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-DeleteaField

 -Joe


 On 5/29/2015 7:30 PM, Shawn Heisey wrote:

 On 5/29/2015 5:08 PM, Joseph Obernberger wrote:

 Hi All - I have a lot of fields to delete, but noticed that once I
 started deleting them, I quickly ran out of heap space.  Is
 delete-field a memory intensive operation?  Should I delete one field,
 wait a while, then delete the next?

 I'm not aware of a way to delete a field.  I may have a different
 definition of what a field is than you do, though.

 Solr lets you delete entire documents, but deleting a field from the
 entire index would involve re-indexing every document in the index,
 excluding that field.

 Can you be more specific about exactly what you are doing, what you are
 seeing, and what you want to see instead?

 Also, please be aware of this:

 http://people.apache.org/~hossman/#threadhijack

 Thanks,
 Shawn





Re: Deleting Fields

2015-05-30 Thread Joseph Obernberger
)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:463)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space



-Joe

On 5/30/2015 12:32 AM, Erick Erickson wrote:

Yes, but deleting fields from the schema only means that _future_
documents will throw an undefined field error. All the documents
currently in the index will retain that field.

Why you're hitting an OOM is a mystery though. But delete field isn't
removing the contents of indexed documents. Showing us the full stack
when you hit an OOM would be helpful.

Best,
Erick

On Fri, May 29, 2015 at 4:58 PM, Joseph Obernberger
j...@lovehorsepower.com wrote:

Thank you Shawn - I'm referring to fields in the schema.  With Solr 5, you
can delete fields from the schema.
https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-DeleteaField

-Joe


On 5/29/2015 7:30 PM, Shawn Heisey wrote:

On 5/29/2015 5:08 PM, Joseph Obernberger wrote:

Hi All - I have a lot of fields to delete, but noticed that once I
started deleting them, I quickly ran out of heap space.  Is
delete-field a memory intensive operation?  Should I delete one field,
wait a while, then delete the next?

I'm not aware of a way to delete a field.  I may have a different
definition of what a field is than you do, though.

Solr lets you delete entire documents, but deleting a field from the
entire index would involve re-indexing every document in the index,
excluding that field.

Can you be more specific about exactly what you are doing, what you are
seeing, and what you want to see instead?

Also, please be aware of this:

http://people.apache.org/~hossman/#threadhijack

Thanks,
Shawn






Re: Deleting Fields

2015-05-30 Thread Steve Rowe
Hi Joseph,

 On May 30, 2015, at 8:18 AM, Joseph Obernberger j...@lovehorsepower.com 
 wrote:
 
 Thank you Erick.  I was thinking that it actually went through and removed 
 the index data; thank you for the clarification.

I added more info to the Schema API page about this not being true.  Here’s 
what I’ve got so far - let me know if you think we should add more warnings 
about this:

-
Re-index after schema modifications!

If you modify your schema, you will likely need to re-index all documents. If 
you do not, you may lose access to documents, or not be able to interpret them 
properly, e.g. after replacing a field type.

Modifying your schema will never modify any documents that are already indexed. 
Again, you must re-index documents in order to apply schema changes to them.

[…]

When modifying the schema with the API, a core reload will automatically occur 
in order for the changes to be available immediately for documents indexed 
thereafter.  Previously indexed documents will not be automatically handled - 
they must be re-indexed if they used schema elements that you changed.
-

Steve
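
The re-index step described above can be sketched roughly as follows. This is only an illustration in Python, and it assumes every field you still need is stored (or that the original source data is available to index from). The helper name `prepare_for_reindex` and the field names in the example are hypothetical and not part of Solr; `_version_` is dropped on the assumption that Solr should assign a fresh one when the document is re-added.

```python
def prepare_for_reindex(doc, removed_fields, internal_fields=("_version_",)):
    """Return a copy of a fetched document that is safe to send back for indexing.

    removed_fields: names deleted from the schema (hypothetical examples below).
    internal_fields: bookkeeping fields that should not be re-submitted.
    """
    skip = set(removed_fields) | set(internal_fields)
    # Keep every field except the ones removed from the schema or internal ones.
    return {name: value for name, value in doc.items() if name not in skip}


if __name__ == "__main__":
    fetched = {"id": "1", "title": "hello", "old_field": "x", "_version_": 123}
    print(prepare_for_reindex(fetched, removed_fields=["old_field"]))
```

Each cleaned document would then be sent back through whatever indexing path you normally use; the sketch deliberately leaves that part out.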

Re: Deleting Fields

2015-05-29 Thread Shawn Heisey
On 5/29/2015 5:08 PM, Joseph Obernberger wrote:
 Hi All - I have a lot of fields to delete, but noticed that once I
 started deleting them, I quickly ran out of heap space.  Is
 delete-field a memory intensive operation?  Should I delete one field,
 wait a while, then delete the next?

I'm not aware of a way to delete a field.  I may have a different
definition of what a field is than you do, though.

Solr lets you delete entire documents, but deleting a field from the
entire index would involve re-indexing every document in the index,
excluding that field.

Can you be more specific about exactly what you are doing, what you are
seeing, and what you want to see instead?

Also, please be aware of this:

http://people.apache.org/~hossman/#threadhijack

Thanks,
Shawn



Deleting Fields

2015-05-29 Thread Joseph Obernberger
Hi All - I have a lot of fields to delete, but noticed that once I 
started deleting them, I quickly ran out of heap space.  Is delete-field 
a memory intensive operation?  Should I delete one field, wait a while, 
then delete the next?

Thank you!

-Joe


Re: Deleting Fields

2015-05-29 Thread Joseph Obernberger
Thank you Shawn - I'm referring to fields in the schema.  With Solr 5, 
you can delete fields from the schema.

https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-DeleteaField

-Joe
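
For reference, a minimal sketch of what a single delete-field call looks like, written in Python with only the standard library. The host, collection name and field name are made up for illustration, and the helper `build_delete_field_request` is hypothetical; the only thing taken from the Schema API page linked above is the `delete-field` command POSTed as JSON to the collection's `/schema` endpoint. The request is built but not sent.

```python
import json
import urllib.request


def build_delete_field_request(solr_url, collection, field_name):
    """Build (but do not send) a Schema API request that removes one field."""
    # Schema API commands are sent as a JSON object to /<collection>/schema.
    payload = {"delete-field": {"name": field_name}}
    return urllib.request.Request(
        url=f"{solr_url.rstrip('/')}/{collection}/schema",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host, collection and field name.
    req = build_delete_field_request(
        "http://localhost:8983/solr", "mycollection", "old_field"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```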

On 5/29/2015 7:30 PM, Shawn Heisey wrote:

On 5/29/2015 5:08 PM, Joseph Obernberger wrote:

Hi All - I have a lot of fields to delete, but noticed that once I
started deleting them, I quickly ran out of heap space.  Is
delete-field a memory intensive operation?  Should I delete one field,
wait a while, then delete the next?

I'm not aware of a way to delete a field.  I may have a different
definition of what a field is than you do, though.

Solr lets you delete entire documents, but deleting a field from the
entire index would involve re-indexing every document in the index,
excluding that field.

Can you be more specific about exactly what you are doing, what you are
seeing, and what you want to see instead?

Also, please be aware of this:

http://people.apache.org/~hossman/#threadhijack

Thanks,
Shawn






Re: Deleting Fields

2015-05-29 Thread Erick Erickson
Yes, but deleting fields from the schema only means that _future_
documents will throw an undefined field error. All the documents
currently in the index will retain that field.

Why you're hitting an OOM is a mystery though. But delete field isn't
removing the contents of indexed documents. Showing us the full stack
when you hit an OOM would be helpful.

Best,
Erick
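
To make the distinction concrete, here is a toy model in Python of the behaviour described above. It is not how Solr is implemented; `ToyIndex` and its methods are invented purely to show that removing a field from the schema leaves already-indexed documents alone and only affects documents added afterwards.

```python
class ToyIndex:
    """Toy model only: a schema (set of field names) plus a list of documents."""

    def __init__(self, schema_fields):
        self.schema_fields = set(schema_fields)
        self.docs = []

    def add(self, doc):
        # New documents are checked against the current schema.
        unknown = set(doc) - self.schema_fields
        if unknown:
            raise ValueError(f"undefined field(s): {sorted(unknown)}")
        self.docs.append(dict(doc))

    def delete_field_from_schema(self, name):
        # Only the schema changes; documents already in the index are untouched.
        self.schema_fields.discard(name)


if __name__ == "__main__":
    index = ToyIndex(["id", "old_field"])
    index.add({"id": "1", "old_field": "x"})
    index.delete_field_from_schema("old_field")
    print(index.docs)  # the existing document still carries old_field
    try:
        index.add({"id": "2", "old_field": "y"})
    except ValueError as err:
        print("rejected:", err)
```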

On Fri, May 29, 2015 at 4:58 PM, Joseph Obernberger
j...@lovehorsepower.com wrote:
 Thank you Shawn - I'm referring to fields in the schema.  With Solr 5, you
 can delete fields from the schema.
 https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-DeleteaField

 -Joe


 On 5/29/2015 7:30 PM, Shawn Heisey wrote:

 On 5/29/2015 5:08 PM, Joseph Obernberger wrote:

 Hi All - I have a lot of fields to delete, but noticed that once I
 started deleting them, I quickly ran out of heap space.  Is
 delete-field a memory intensive operation?  Should I delete one field,
 wait a while, then delete the next?

 I'm not aware of a way to delete a field.  I may have a different
 definition of what a field is than you do, though.

 Solr lets you delete entire documents, but deleting a field from the
 entire index would involve re-indexing every document in the index,
 excluding that field.

 Can you be more specific about exactly what you are doing, what you are
 seeing, and what you want to see instead?

 Also, please be aware of this:

 http://people.apache.org/~hossman/#threadhijack

 Thanks,
 Shawn