Re: Apache Manifold 2.10

2018-11-30 Thread Karl Wright
Hi Krishna,

First of all I suggest that you *not* use multiprocess-file-example, and
instead use multiprocess-zk-example.

Your symptoms suggest many possibilities.  But if you move to Zookeeper we
will be able to eliminate dangling file locks as a complication.  So please
do that first.
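
As a rough illustration of why dangling file locks are a complication, here is a generic Python sketch. It is not ManifoldCF's actual locking code, and `try_acquire`, `release`, and `demo` are hypothetical names. A lock file created by a process that is killed before it releases the lock stays on disk and blocks every later attempt until someone removes it by hand.

```python
import os
import tempfile


def try_acquire(lock_path):
    """Create the lock file atomically; return True only if we created it."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def release(lock_path):
    """Remove the lock file; a crashed process never reaches this call."""
    os.remove(lock_path)


def demo():
    lock_path = os.path.join(tempfile.mkdtemp(), "job.lock")
    first = try_acquire(lock_path)   # process A takes the lock
    # Process A is killed here, so release() is never called.
    second = try_acquire(lock_path)  # process B finds a dangling lock
    release(lock_path)               # manual clean-up by an operator
    third = try_acquire(lock_path)   # the lock can be taken again
    return first, second, third
```

ZooKeeper can tie a lock to a client session (ephemeral nodes), so a lock held by a dead process goes away when its session ends. Whether dangling locks are the actual cause of the symptoms here is not established yet.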

Karl


Re: Apache Manifold 2.10

2018-11-30 Thread krishna agrawal
Yes, in our local setup we used the simple example, but on the server we
used multiprocess-file-example. Are you suggesting that we upgrade from 2.10
to 2.11?

We are using a MySQL database.

Most of the time I see that nothing is running, yet it still says the job is
running and that I have to wait for it to complete.

Restarting does not help either.

Any other solution would be greatly appreciated.

Thanks,
Krishna A


[jira] [Commented] (CONNECTORS-1560) Improve tika-server robustness via -spawnChild

2018-11-30 Thread Tim Allison (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705375#comment-16705375
 ] 

Tim Allison commented on CONNECTORS-1560:
-----------------------------------------

Doh. Sorry!

> Improve tika-server robustness via -spawnChild
> ----------------------------------------------
>
>         Key: CONNECTORS-1560
>         URL: https://issues.apache.org/jira/browse/CONNECTORS-1560
>     Project: ManifoldCF
>  Issue Type: Wish
>    Reporter: Tim Allison
>    Priority: Major
>
> I'd encourage you to consider adopting the new {{-spawnChild}} mode in 
> tika-server.  See the documentation here: 
> https://wiki.apache.org/tika/TikaJAXRS#Making%20Tika%20Server%20Robust%20to%20OOMs,%20Infinite%20Loops%20and%20Memory%20Leaks
> The small downside is that the server can go down for a few seconds during 
> the restart.   Clients have to be prepared for an IOException on files that 
> are being parsed when the child server goes down and/or if the child is being 
> restarted.  The upside is that your users will be protected against infinite 
> loops, OOM and memory leaks...things that we used to just hope never 
> happened...but they do, and they will.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CONNECTORS-1560) Improve tika-server robustness via -spawnChild

2018-11-30 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705329#comment-16705329
 ] 

Karl Wright commented on CONNECTORS-1560:
-----------------------------------------

[~talli...@apache.org], ManifoldCF does not ship the Tika Server.  We provide a 
transformation connector that talks to it, but that is all.  There is also an 
embedded Tika transformer which works for many people, but if people run into 
difficulties with it we recommend using the external server and setting it up 
themselves.





[jira] [Resolved] (CONNECTORS-1560) Improve tika-server robustness via -spawnChild

2018-11-30 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1560.
-------------------------------------
Resolution: Won't Fix


[jira] [Created] (CONNECTORS-1561) Upgrade to Tika 1.20 when available

2018-11-30 Thread Tim Allison (JIRA)
Tim Allison created CONNECTORS-1561:
------------------------------------

     Summary: Upgrade to Tika 1.20 when available
         Key: CONNECTORS-1561
         URL: https://issues.apache.org/jira/browse/CONNECTORS-1561
     Project: ManifoldCF
  Issue Type: Improvement
    Reporter: Tim Allison


On TIKA-2776, a ManifoldCF user alerted us to a bad bug in Tika 1.19 and 1.19.1 
that causes tika-server to return 503 forever after it hits an OOM.  This is 
bad.  We'll be rolling a fix out in a week or two in Tika 1.20.
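
One way a client could notice the failure mode described above (tika-server answering 503 indefinitely) is to count consecutive 503 responses and flag the server once a threshold is reached. The sketch below is illustrative only: `StuckServerDetector` and its default threshold are hypothetical and are not part of Tika or ManifoldCF.

```python
class StuckServerDetector:
    """Flags a server as stuck after N consecutive HTTP 503 responses."""

    def __init__(self, threshold=5):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.consecutive_503 = 0

    def record(self, status_code):
        """Record one response status; return True once the server looks stuck."""
        if status_code == 503:
            self.consecutive_503 += 1
        else:
            # Any other status means the server answered normally; start over.
            self.consecutive_503 = 0
        return self.consecutive_503 >= self.threshold
```

A caller would feed each response status into `record()` and, when it returns True, restart the external server or alert an operator instead of retrying forever.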





[jira] [Created] (CONNECTORS-1560) Improve tika-server robustness via -spawnChild

2018-11-30 Thread Tim Allison (JIRA)
Tim Allison created CONNECTORS-1560:
------------------------------------

     Summary: Improve tika-server robustness via -spawnChild
         Key: CONNECTORS-1560
         URL: https://issues.apache.org/jira/browse/CONNECTORS-1560
     Project: ManifoldCF
  Issue Type: Wish
    Reporter: Tim Allison


I'd encourage you to consider adopting the new {{-spawnChild}} mode in 
tika-server.  See the documentation here: 
https://wiki.apache.org/tika/TikaJAXRS#Making%20Tika%20Server%20Robust%20to%20OOMs,%20Infinite%20Loops%20and%20Memory%20Leaks

The small downside is that the server can go down for a few seconds during the 
restart.   Clients have to be prepared for an IOException on files that are 
being parsed when the child server goes down and/or if the child is being 
restarted.  The upside is that your users will be protected against infinite 
loops, OOM and memory leaks...things that we used to just hope never 
happened...but they do, and they will.
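
To make the "clients have to be prepared for an IOException" point concrete, here is a minimal retry sketch in Python. It assumes the client wraps its call to the server in a function that raises `OSError` (Python's equivalent of an I/O error) when the connection drops. `parse_with_retry`, `parse`, and the default attempt count and delay are hypothetical, not part of Tika or ManifoldCF.

```python
import time


def parse_with_retry(parse, document, attempts=3, delay_seconds=2.0, sleep=time.sleep):
    """Call parse(document), retrying on I/O errors while the child restarts."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return parse(document)
        except OSError as error:  # IOError is an alias of OSError in Python 3
            last_error = error
            if attempt < attempts:
                # Give the child server a few seconds to come back up.
                sleep(delay_seconds)
    # Every attempt failed; surface the last I/O error to the caller.
    raise last_error
```

The `sleep` parameter is injectable so the delay can be skipped in tests. A document that still fails after the last attempt is reported to the caller rather than silently dropped.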




Re: Apache Manifold 2.10

2018-11-30 Thread Karl Wright
It also may be useful to start with the simple example, which is not
multiprocess, and get familiar with using ManifoldCF that way, before you
try to go to a more complicated setup.

Thanks,
Karl


Re: Apache Manifold 2.10

2018-11-30 Thread Karl Wright
"simplified multi-process"?  There is no such example.

These are the examples available.  Which one are you using?

11/15/2018  03:40 AM  example
11/15/2018  03:40 AM  example-proprietary
11/15/2018  03:40 AM  multiprocess-file-example
11/15/2018  03:40 AM  multiprocess-file-example-proprietary
11/15/2018  03:40 AM  multiprocess-zk-example
11/15/2018  03:40 AM  multiprocess-zk-example-proprietary

Cleaning locks makes no sense unless you are using the multiprocess-file
setup.  This is deprecated, by the way, in favor of the Zookeeper setup.

As for the buttons, please read:

https://manifoldcf.apache.org/release/release-2.11/en_US/end-user-documentation.html#outputs

The buttons in question are "Reindex all..." and "Remove all..."

Karl


Re: Apache Manifold 2.10

2018-11-30 Thread krishna agrawal
We have deployed ManifoldCF using

   - Simplified multi-process model

We tried the lock clean-up shell script, but that did not work either.

I don't have a "forget all documents" button on the output connector.

[image: image.png]

On Thu, Nov 29, 2018 at 6:52 PM Karl Wright  wrote:

> Hi Krishna,
>
> Please give us some background as to how you've deployed ManifoldCF.  Are
> you using one of the examples?  If so, which one?
>
> The detailed answer to your question is: the job must delete all documents
> it indexed before it can be deleted.  That is the typical way jobs work.
> Thus, if you shut down the target of your output connection, you may be
> blocked in deleting your job.
>
> At that point, you can either (a) restart the target of your output
> connection, or (b) go to the "view" page for the output connection and
> click both of the "forget all documents" buttons on it.  (b) is not
> recommended unless you really want to start over fresh on your output
> index.
>
> Thanks,
> Karl
>
>
> On Thu, Nov 29, 2018 at 3:21 PM krishna agrawal 
> wrote:
>
> > Hi, we are facing an issue where the action button is not available.
> >
> > [image: image.png]
> >
> > I have stopped the agent process, but I am still not able to remove the
> > job; it says it is busy (message below).
> >
> > Shouldn't there be some way to forcefully restart and stop the running
> > process?
> >
> > Job 1542835910915 is busy; you must wait and/or shut it down before
> > deleting it
> >
> > But there is no job running, and I have been seeing this message for the
> > past 3 days.
> >
> > Is there any way to clear this?
> >
> >
> > Any help in this matter will be appreciated.
> >
> > Thanks,
> > Krishna A
> >
>