Re: sharepoint crawler documents limit

2019-12-20 Thread Karl Wright
Hi Priya,

This has nothing to do with anything in ManifoldCF.

Karl


On Fri, Dec 20, 2019 at 7:56 AM Priya Arora  wrote:

> Hi All,
>
> Is this issue something to do with the below values/parameters set in
> properties.xml?
> [image: image.png]
>
>
> On Fri, Dec 20, 2019 at 5:21 PM Jorge Alonso Garcia 
> wrote:
>
>> And what other SharePoint parameter could I check?
>>
>> Jorge Alonso Garcia
>>
>>
>>
>> On Fri, Dec 20, 2019 at 12:47 PM, Karl Wright ()
>> wrote:
>>
>>> The code seems correct and many people are using it without encountering
>>> this problem.  There may be another SharePoint configuration parameter you
>>> also need to look at somewhere.
>>>
>>> Karl
>>>
>>>
>>> On Fri, Dec 20, 2019 at 6:38 AM Jorge Alonso Garcia 
>>> wrote:
>>>

 Hi Karl,
 On SharePoint the list view threshold is 150,000 but we only receive
 20,000 from MCF
 [image: image.png]


 Jorge Alonso Garcia



 On Thu, Dec 19, 2019 at 7:19 PM, Karl Wright ()
 wrote:

> If the job finished without error it implies that the number of
> documents returned from this one library was 1 when the service is
> called the first time (starting at doc 0), 1 when it's called the
> second time (starting at doc 1), and zero when it is called the third
> time (starting at doc 2).
>
> The plugin code is unremarkable and actually gets results in chunks of
> 1000 under the covers:
>
> >>
> SPQuery listQuery = new SPQuery();
> listQuery.Query = " Override=\"TRUE\">";
> listQuery.QueryThrottleMode =
> SPQueryThrottleOption.Override;
> listQuery.ViewAttributes =
> "Scope=\"Recursive\"";
> listQuery.ViewFields = " Name='FileRef' />";
> listQuery.RowLimit = 1000;
>
> XmlDocument doc = new XmlDocument();
> retVal = doc.CreateElement("GetListItems",
> "
> http://schemas.microsoft.com/sharepoint/soap/directory/;);
> XmlNode getListItemsNode =
> doc.CreateElement("GetListItemsResponse");
>
> uint counter = 0;
> do
> {
> if (counter >= startRowParam +
> rowLimitParam)
> break;
>
> SPListItemCollection collListItems =
> oList.GetItems(listQuery);
>
>
> foreach (SPListItem oListItem in
> collListItems)
> {
> if (counter >= startRowParam &&
> counter < startRowParam + rowLimitParam)
> {
> XmlNode resultNode =
> doc.CreateElement("GetListItemsResult");
> XmlAttribute idAttribute =
> doc.CreateAttribute("FileRef");
> idAttribute.Value = oListItem.Url;
>
> resultNode.Attributes.Append(idAttribute);
> XmlAttribute urlAttribute =
> doc.CreateAttribute("ListItemURL");
> //urlAttribute.Value =
> oListItem.ParentList.DefaultViewUrl;
> urlAttribute.Value =
> string.Format("{0}?ID={1}",
> oListItem.ParentList.Forms[PAGETYPE.PAGE_DISPLAYFORM].ServerRelativeUrl,
> oListItem.ID);
>
> resultNode.Attributes.Append(urlAttribute);
>
> getListItemsNode.AppendChild(resultNode);
> }
> counter++;
> }
>
> listQuery.ListItemCollectionPosition =
> collListItems.ListItemCollectionPosition;
>
> } while (listQuery.ListItemCollectionPosition
> != null);
>
> retVal.AppendChild(getListItemsNode);
> <<
>
> The code is clearly working if you get 2 results returned, so I
> submit that perhaps there's a configured limit in your SharePoint instance
> that prevents listing more than 2.  That's the only way I can explain
> this.
>
> Karl
>
>
> On Thu, Dec 19, 2019 at 12:51 PM Jorge Alonso Garcia <
> jalon...@gmail.com> wrote:
>
>> Hi,
>> The job finishes OK (several times) but always with these 2
>> documents; for some reason the loop only executes twice
>>
>> Jorge Alonso Garcia
>>
>>
>>
>> On Thu, Dec 19, 2019 at 6:14 PM, Karl Wright ()
>> wrote:
>>
>>> If they are all in one document, then you'd be running this

Re: sharepoint crawler documents limit

2019-12-20 Thread Priya Arora
Hi All,

Is this issue something to do with the below values/parameters set in
properties.xml?
[image: image.png]


On Fri, Dec 20, 2019 at 5:21 PM Jorge Alonso Garcia 
wrote:

> And what other SharePoint parameter could I check?
>
> Jorge Alonso Garcia
>
>
>
> On Fri, Dec 20, 2019 at 12:47 PM, Karl Wright ()
> wrote:
>
>> The code seems correct and many people are using it without encountering
>> this problem.  There may be another SharePoint configuration parameter you
>> also need to look at somewhere.
>>
>> Karl
>>
>>
>> On Fri, Dec 20, 2019 at 6:38 AM Jorge Alonso Garcia 
>> wrote:
>>
>>>
>>> Hi Karl,
>>> On SharePoint the list view threshold is 150,000 but we only receive
>>> 20,000 from MCF
>>> [image: image.png]
>>>
>>>
>>> Jorge Alonso Garcia
>>>
>>>
>>>
>>> On Thu, Dec 19, 2019 at 7:19 PM, Karl Wright ()
>>> wrote:
>>>
 If the job finished without error it implies that the number of
 documents returned from this one library was 1 when the service is
 called the first time (starting at doc 0), 1 when it's called the
 second time (starting at doc 1), and zero when it is called the third
 time (starting at doc 2).

 The plugin code is unremarkable and actually gets results in chunks of
 1000 under the covers:

 >>
 SPQuery listQuery = new SPQuery();
 listQuery.Query = ">>> Override=\"TRUE\">";
 listQuery.QueryThrottleMode =
 SPQueryThrottleOption.Override;
 listQuery.ViewAttributes =
 "Scope=\"Recursive\"";
 listQuery.ViewFields = ">>> Name='FileRef' />";
 listQuery.RowLimit = 1000;

 XmlDocument doc = new XmlDocument();
 retVal = doc.CreateElement("GetListItems",
 "
 http://schemas.microsoft.com/sharepoint/soap/directory/;);
 XmlNode getListItemsNode =
 doc.CreateElement("GetListItemsResponse");

 uint counter = 0;
 do
 {
 if (counter >= startRowParam +
 rowLimitParam)
 break;

 SPListItemCollection collListItems =
 oList.GetItems(listQuery);


 foreach (SPListItem oListItem in
 collListItems)
 {
 if (counter >= startRowParam && counter
 < startRowParam + rowLimitParam)
 {
 XmlNode resultNode =
 doc.CreateElement("GetListItemsResult");
 XmlAttribute idAttribute =
 doc.CreateAttribute("FileRef");
 idAttribute.Value = oListItem.Url;

 resultNode.Attributes.Append(idAttribute);
 XmlAttribute urlAttribute =
 doc.CreateAttribute("ListItemURL");
 //urlAttribute.Value =
 oListItem.ParentList.DefaultViewUrl;
 urlAttribute.Value =
 string.Format("{0}?ID={1}",
 oListItem.ParentList.Forms[PAGETYPE.PAGE_DISPLAYFORM].ServerRelativeUrl,
 oListItem.ID);

 resultNode.Attributes.Append(urlAttribute);

 getListItemsNode.AppendChild(resultNode);
 }
 counter++;
 }

 listQuery.ListItemCollectionPosition =
 collListItems.ListItemCollectionPosition;

 } while (listQuery.ListItemCollectionPosition
 != null);

 retVal.AppendChild(getListItemsNode);
 <<

 The code is clearly working if you get 2 results returned, so I
 submit that perhaps there's a configured limit in your SharePoint instance
 that prevents listing more than 2.  That's the only way I can explain
 this.

 Karl


 On Thu, Dec 19, 2019 at 12:51 PM Jorge Alonso Garcia <
 jalon...@gmail.com> wrote:

> Hi,
> The job finishes OK (several times) but always with these 2
> documents; for some reason the loop only executes twice
>
> Jorge Alonso Garcia
>
>
>
> On Thu, Dec 19, 2019 at 6:14 PM, Karl Wright ()
> wrote:
>
>> If they are all in one document, then you'd be running this code:
>>
>> >>
>> int startingIndex = 0;
>> int amtToRequest = 1;
>> while (true)
>> {
>>
>> com.microsoft.sharepoint.webpartpages.GetListItemsResponseGetListItemsResult
>> itemsResult =
>>

Re: sharepoint crawler documents limit

2019-12-20 Thread Jorge Alonso Garcia
And what other SharePoint parameter could I check?

Jorge Alonso Garcia



On Fri, Dec 20, 2019 at 12:47 PM, Karl Wright ()
wrote:

> The code seems correct and many people are using it without encountering
> this problem.  There may be another SharePoint configuration parameter you
> also need to look at somewhere.
>
> Karl
>
>
> On Fri, Dec 20, 2019 at 6:38 AM Jorge Alonso Garcia 
> wrote:
>
>>
>> Hi Karl,
>> On SharePoint the list view threshold is 150,000 but we only receive
>> 20,000 from MCF
>> [image: image.png]
>>
>>
>> Jorge Alonso Garcia
>>
>>
>>
>> On Thu, Dec 19, 2019 at 7:19 PM, Karl Wright ()
>> wrote:
>>
>>> If the job finished without error it implies that the number of
>>> documents returned from this one library was 1 when the service is
>>> called the first time (starting at doc 0), 1 when it's called the
>>> second time (starting at doc 1), and zero when it is called the third
>>> time (starting at doc 2).
>>>
>>> The plugin code is unremarkable and actually gets results in chunks of
>>> 1000 under the covers:
>>>
>>> >>
>>> SPQuery listQuery = new SPQuery();
>>> listQuery.Query = ">> Override=\"TRUE\">";
>>> listQuery.QueryThrottleMode =
>>> SPQueryThrottleOption.Override;
>>> listQuery.ViewAttributes = "Scope=\"Recursive\"";
>>> listQuery.ViewFields = ">> />";
>>> listQuery.RowLimit = 1000;
>>>
>>> XmlDocument doc = new XmlDocument();
>>> retVal = doc.CreateElement("GetListItems",
>>> "
>>> http://schemas.microsoft.com/sharepoint/soap/directory/;);
>>> XmlNode getListItemsNode =
>>> doc.CreateElement("GetListItemsResponse");
>>>
>>> uint counter = 0;
>>> do
>>> {
>>> if (counter >= startRowParam + rowLimitParam)
>>> break;
>>>
>>> SPListItemCollection collListItems =
>>> oList.GetItems(listQuery);
>>>
>>>
>>> foreach (SPListItem oListItem in
>>> collListItems)
>>> {
>>> if (counter >= startRowParam && counter
>>> < startRowParam + rowLimitParam)
>>> {
>>> XmlNode resultNode =
>>> doc.CreateElement("GetListItemsResult");
>>> XmlAttribute idAttribute =
>>> doc.CreateAttribute("FileRef");
>>> idAttribute.Value = oListItem.Url;
>>>
>>> resultNode.Attributes.Append(idAttribute);
>>> XmlAttribute urlAttribute =
>>> doc.CreateAttribute("ListItemURL");
>>> //urlAttribute.Value =
>>> oListItem.ParentList.DefaultViewUrl;
>>> urlAttribute.Value =
>>> string.Format("{0}?ID={1}",
>>> oListItem.ParentList.Forms[PAGETYPE.PAGE_DISPLAYFORM].ServerRelativeUrl,
>>> oListItem.ID);
>>>
>>> resultNode.Attributes.Append(urlAttribute);
>>>
>>> getListItemsNode.AppendChild(resultNode);
>>> }
>>> counter++;
>>> }
>>>
>>> listQuery.ListItemCollectionPosition =
>>> collListItems.ListItemCollectionPosition;
>>>
>>> } while (listQuery.ListItemCollectionPosition !=
>>> null);
>>>
>>> retVal.AppendChild(getListItemsNode);
>>> <<
>>>
>>> The code is clearly working if you get 2 results returned, so I
>>> submit that perhaps there's a configured limit in your SharePoint instance
>>> that prevents listing more than 2.  That's the only way I can explain
>>> this.
>>>
>>> Karl
>>>
>>>
>>> On Thu, Dec 19, 2019 at 12:51 PM Jorge Alonso Garcia 
>>> wrote:
>>>
 Hi,
 The job finishes OK (several times) but always with these 2
 documents; for some reason the loop only executes twice

 Jorge Alonso Garcia



 On Thu, Dec 19, 2019 at 6:14 PM, Karl Wright ()
 wrote:

> If they are all in one document, then you'd be running this code:
>
> >>
> int startingIndex = 0;
> int amtToRequest = 1;
> while (true)
> {
>
> com.microsoft.sharepoint.webpartpages.GetListItemsResponseGetListItemsResult
> itemsResult =
>
> itemCall.getListItems(guid,Integer.toString(startingIndex),Integer.toString(amtToRequest));
>
>   MessageElement[] itemsList = itemsResult.get_any();
>
>   if (Logging.connectors.isDebugEnabled()){
> Logging.connectors.debug("SharePoint: getChildren xml
> response: " + itemsList[0].toString());
>   

Re: sharepoint crawler documents limit

2019-12-20 Thread Karl Wright
The code seems correct and many people are using it without encountering
this problem.  There may be another SharePoint configuration parameter you
also need to look at somewhere.

Karl


On Fri, Dec 20, 2019 at 6:38 AM Jorge Alonso Garcia 
wrote:

>
> Hi Karl,
> On SharePoint the list view threshold is 150,000 but we only receive
> 20,000 from MCF
> [image: image.png]
>
>
> Jorge Alonso Garcia
>
>
>
> On Thu, Dec 19, 2019 at 7:19 PM, Karl Wright ()
> wrote:
>
>> If the job finished without error it implies that the number of documents
>> returned from this one library was 1 when the service is called the
>> first time (starting at doc 0), 1 when it's called the second time
>> (starting at doc 1), and zero when it is called the third time
>> (starting at doc 2).
>>
>> The plugin code is unremarkable and actually gets results in chunks of
>> 1000 under the covers:
>>
>> >>
>> SPQuery listQuery = new SPQuery();
>> listQuery.Query = "> Override=\"TRUE\">";
>> listQuery.QueryThrottleMode =
>> SPQueryThrottleOption.Override;
>> listQuery.ViewAttributes = "Scope=\"Recursive\"";
>> listQuery.ViewFields = "> />";
>> listQuery.RowLimit = 1000;
>>
>> XmlDocument doc = new XmlDocument();
>> retVal = doc.CreateElement("GetListItems",
>> "
>> http://schemas.microsoft.com/sharepoint/soap/directory/;);
>> XmlNode getListItemsNode =
>> doc.CreateElement("GetListItemsResponse");
>>
>> uint counter = 0;
>> do
>> {
>> if (counter >= startRowParam + rowLimitParam)
>> break;
>>
>> SPListItemCollection collListItems =
>> oList.GetItems(listQuery);
>>
>>
>> foreach (SPListItem oListItem in
>> collListItems)
>> {
>> if (counter >= startRowParam && counter <
>> startRowParam + rowLimitParam)
>> {
>> XmlNode resultNode =
>> doc.CreateElement("GetListItemsResult");
>> XmlAttribute idAttribute =
>> doc.CreateAttribute("FileRef");
>> idAttribute.Value = oListItem.Url;
>>
>> resultNode.Attributes.Append(idAttribute);
>> XmlAttribute urlAttribute =
>> doc.CreateAttribute("ListItemURL");
>> //urlAttribute.Value =
>> oListItem.ParentList.DefaultViewUrl;
>> urlAttribute.Value =
>> string.Format("{0}?ID={1}",
>> oListItem.ParentList.Forms[PAGETYPE.PAGE_DISPLAYFORM].ServerRelativeUrl,
>> oListItem.ID);
>>
>> resultNode.Attributes.Append(urlAttribute);
>>
>> getListItemsNode.AppendChild(resultNode);
>> }
>> counter++;
>> }
>>
>> listQuery.ListItemCollectionPosition =
>> collListItems.ListItemCollectionPosition;
>>
>> } while (listQuery.ListItemCollectionPosition !=
>> null);
>>
>> retVal.AppendChild(getListItemsNode);
>> <<
>>
>> The code is clearly working if you get 2 results returned, so I
>> submit that perhaps there's a configured limit in your SharePoint instance
>> that prevents listing more than 2.  That's the only way I can explain
>> this.
>>
>> Karl
>>
>>
>> On Thu, Dec 19, 2019 at 12:51 PM Jorge Alonso Garcia 
>> wrote:
>>
>>> Hi,
>>> The job finishes OK (several times) but always with these 2 documents;
>>> for some reason the loop only executes twice
>>>
>>> Jorge Alonso Garcia
>>>
>>>
>>>
>>> On Thu, Dec 19, 2019 at 6:14 PM, Karl Wright ()
>>> wrote:
>>>
 If they are all in one document, then you'd be running this code:

 >>
 int startingIndex = 0;
 int amtToRequest = 1;
 while (true)
 {

 com.microsoft.sharepoint.webpartpages.GetListItemsResponseGetListItemsResult
 itemsResult =

 itemCall.getListItems(guid,Integer.toString(startingIndex),Integer.toString(amtToRequest));

   MessageElement[] itemsList = itemsResult.get_any();

   if (Logging.connectors.isDebugEnabled()){
 Logging.connectors.debug("SharePoint: getChildren xml
 response: " + itemsList[0].toString());
   }

   if (itemsList.length != 1)
 throw new ManifoldCFException("Bad response - expecting one
 outer 'GetListItems' node, saw "+Integer.toString(itemsList.length));

   MessageElement items = itemsList[0];
   if
 

Re: sharepoint crawler documents limit

2019-12-20 Thread Jorge Alonso Garcia
Hi Karl,
On SharePoint the list view threshold is 150,000 but we only receive 20,000
from MCF
[image: image.png]


Jorge Alonso Garcia



On Thu, Dec 19, 2019 at 7:19 PM, Karl Wright ()
wrote:

> If the job finished without error it implies that the number of documents
> returned from this one library was 1 when the service is called the
> first time (starting at doc 0), 1 when it's called the second time
> (starting at doc 1), and zero when it is called the third time
> (starting at doc 2).
>
> The plugin code is unremarkable and actually gets results in chunks of
> 1000 under the covers:
>
> >>
> SPQuery listQuery = new SPQuery();
> listQuery.Query = " Override=\"TRUE\">";
> listQuery.QueryThrottleMode =
> SPQueryThrottleOption.Override;
> listQuery.ViewAttributes = "Scope=\"Recursive\"";
> listQuery.ViewFields = " />";
> listQuery.RowLimit = 1000;
>
> XmlDocument doc = new XmlDocument();
> retVal = doc.CreateElement("GetListItems",
> "
> http://schemas.microsoft.com/sharepoint/soap/directory/;);
> XmlNode getListItemsNode =
> doc.CreateElement("GetListItemsResponse");
>
> uint counter = 0;
> do
> {
> if (counter >= startRowParam + rowLimitParam)
> break;
>
> SPListItemCollection collListItems =
> oList.GetItems(listQuery);
>
>
> foreach (SPListItem oListItem in collListItems)
> {
> if (counter >= startRowParam && counter <
> startRowParam + rowLimitParam)
> {
> XmlNode resultNode =
> doc.CreateElement("GetListItemsResult");
> XmlAttribute idAttribute =
> doc.CreateAttribute("FileRef");
> idAttribute.Value = oListItem.Url;
>
> resultNode.Attributes.Append(idAttribute);
> XmlAttribute urlAttribute =
> doc.CreateAttribute("ListItemURL");
> //urlAttribute.Value =
> oListItem.ParentList.DefaultViewUrl;
> urlAttribute.Value =
> string.Format("{0}?ID={1}",
> oListItem.ParentList.Forms[PAGETYPE.PAGE_DISPLAYFORM].ServerRelativeUrl,
> oListItem.ID);
>
> resultNode.Attributes.Append(urlAttribute);
>
> getListItemsNode.AppendChild(resultNode);
> }
> counter++;
> }
>
> listQuery.ListItemCollectionPosition =
> collListItems.ListItemCollectionPosition;
>
> } while (listQuery.ListItemCollectionPosition !=
> null);
>
> retVal.AppendChild(getListItemsNode);
> <<
>
> The code is clearly working if you get 2 results returned, so I submit
> that perhaps there's a configured limit in your SharePoint instance that
> prevents listing more than 2.  That's the only way I can explain this.
>
> Karl
>
>
> On Thu, Dec 19, 2019 at 12:51 PM Jorge Alonso Garcia 
> wrote:
>
>> Hi,
>> The job finishes OK (several times) but always with these 2 documents;
>> for some reason the loop only executes twice
>>
>> Jorge Alonso Garcia
>>
>>
>>
>> On Thu, Dec 19, 2019 at 6:14 PM, Karl Wright ()
>> wrote:
>>
>>> If they are all in one document, then you'd be running this code:
>>>
>>> >>
>>> int startingIndex = 0;
>>> int amtToRequest = 1;
>>> while (true)
>>> {
>>>
>>> com.microsoft.sharepoint.webpartpages.GetListItemsResponseGetListItemsResult
>>> itemsResult =
>>>
>>> itemCall.getListItems(guid,Integer.toString(startingIndex),Integer.toString(amtToRequest));
>>>
>>>   MessageElement[] itemsList = itemsResult.get_any();
>>>
>>>   if (Logging.connectors.isDebugEnabled()){
>>> Logging.connectors.debug("SharePoint: getChildren xml
>>> response: " + itemsList[0].toString());
>>>   }
>>>
>>>   if (itemsList.length != 1)
>>> throw new ManifoldCFException("Bad response - expecting one
>>> outer 'GetListItems' node, saw "+Integer.toString(itemsList.length));
>>>
>>>   MessageElement items = itemsList[0];
>>>   if
>>> (!items.getElementName().getLocalName().equals("GetListItems"))
>>> throw new ManifoldCFException("Bad response - outer node
>>> should have been 'GetListItems' node");
>>>
>>>   int resultCount = 0;
>>>   Iterator iter = items.getChildElements();
>>>   while (iter.hasNext())
>>>   {
>>> MessageElement child = (MessageElement)iter.next();
>>> 

Re: Manifoldcf server Error

2019-12-20 Thread Markus Schuch
Hi Priya,

the container you are trying to interactively execute a command in is no
longer running. It is not possible to execute commands in stopped
containers.

The logger issues might be related to missing file system permissions.
But that's a wild guess. Is there a "Caused by" part in the stack trace of
the AppenderLoggingException?
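
As a rough sketch of those two checks (the container name "manifoldcf" is only
a placeholder for whatever "docker ps -a" reports on your host, and the log
path is the one quoted elsewhere in this thread):

  # List all containers, including stopped ones, to see the container state.
  docker ps -a

  # A stopped container can still be inspected and its logs read, but
  # "docker exec" only works once the container is running again.
  docker logs manifoldcf --tail 100
  docker start manifoldcf

  # Then check ownership/permissions of the directory the MyFile appender
  # writes to, and whether the process user can actually create the file.
  docker exec manifoldcf ls -ld /usr/share/manifoldcf/example/logs
  docker exec manifoldcf touch /usr/share/manifoldcf/example/logs/manifoldcf.log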

Cheers,
Markus

On 20.12.2019 at 11:48, Priya Arora wrote:
> Hi All,
>
> When i am trying to execute bash command inside manifoldcf container
> getting error. 
> image.png
> And when checking logs Sudo docker logs 
> 2019-12-19 18:09:05,848 Job start thread ERROR Unable to write to stream
> logs/manifoldcf.log for appender MyFile
> 2019-12-19 18:09:05,848 Seeding thread ERROR Unable to write to stream
> logs/manifoldcf.log for appender MyFile
> 2019-12-19 18:09:05,848 Job reset thread ERROR Unable to write to stream
> logs/manifoldcf.log for appender MyFile
> 2019-12-19 18:09:05,848 Job notification thread ERROR Unable to write to
> stream logs/manifoldcf.log for appender MyFile
> 2019-12-19 18:09:05,849 Seeding thread ERROR An exception occurred
> processing Appender MyFile
> org.apache.logging.log4j.core.appender.AppenderLoggingException: Error
> flushing stream logs/manifoldcf.log
>         at
> org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:159)
>
> Can any body suggest reason behind this error?
>
> Thanks
> Priya
>
> On Fri, Dec 20, 2019 at 3:37 PM Priya Arora  > wrote:
>
> Hi Markus,
>
> Many thanks for your reply!!.
>
> I tried this approach to reproduce the scenario in a different
> environment, but the case  where I listed the error above is when I
> am crawling INTRANET sites which can be accessible over a remote
> server. Also I have used Transformation connectors:-Allow Documents,
> Tika Parser, Content Limiter( 1000), Metadata Adjuster.
>
> When tried reproducing the error with Public sites of the same
> domain and on a different server(DEV), it was successful, with no
> error.Also there was no any postgres related error.
>
> Can it depends observer related configurations like Firewall etc, as
> this case include some firewall,security related configurations.
>
> Thanks 
> Priya
>
>
>
>
> On Fri, Dec 20, 2019 at 3:23 PM Markus Schuch  > wrote:
>
> Hi Priya,
>
> in my experience, i would focus on the OutOfMemoryError (OOME).
> 8 Gigs can be enough, but they don't have to.
>
> At first i would check if the jvm is really getting the desired heap
> size. The dockered environment make that a little harder find
> find out,
> since you need to get access to the jvm metrics, e.g. via jmxremote.
> Beeing able to monitor the jvm metrics helps you with
> correlating the
> errors with the heap and garbage collection activity.
>
> The errors you see on postgresql jdbc driver might be very
> related to
> the OOME.
>
> Some question i would ask myself:
>
> Do the problems repeatingly occur only when crawling this specific
> content source or only with this specific output connection? Can you
> reproduce it outside of docker in a controlled dev environment?
> Or is it
> a more general problem with your manifoldcf instance?
>
> May be there are some huge files beeing crawled in your content
> source?
> To you have any kind of transformations configured? (e.g.
> content size
> limit?) You should try to see in the job's history if there are any
> patterns, like the error rises always after encountering the same
> document xy.
>
> Cheers
> Markus
>
>
>
> On 20.12.2019 at 09:59, Priya Arora wrote:
> > Hi  Markus ,
> >
> > Heap size defined is 8GB. Manifoldcf start-options-unix file 
> Xmx etc
> > parameters is defined to have memory 8192mb.
> >
> > It seems to be an issue with memory also, and also when
> manifoldcf tries
> > to communicate to Database. Do you explicitly define somewhere
> > connection timer when to communicate to postgres.
> > Postgres is installed as a part of docker image pull and then some
> > changes in properties.xml(of manifoldcf) to connect to database.
> > On the other hand Elastic search is also holding sufficient
> memory and
> > Manifoldcf is also provided with 8 cores CPU.
> >
> > Can you suggest some solution.
> >
> > Thanks
> > Priya
> >
> > On Fri, Dec 20, 2019 at 2:23 PM Markus Schuch wrote:
> >
> >     Hi Priya,
> >
> 

Re: Manifoldcf server Error

2019-12-20 Thread Priya Arora
Hi All,

When I am trying to execute a bash command inside the ManifoldCF container I
am getting an error.
[image: image.png]
And when checking the logs with sudo docker logs 
2019-12-19 18:09:05,848 Job start thread ERROR Unable to write to stream
logs/manifoldcf.log for appender MyFile
2019-12-19 18:09:05,848 Seeding thread ERROR Unable to write to stream
logs/manifoldcf.log for appender MyFile
2019-12-19 18:09:05,848 Job reset thread ERROR Unable to write to stream
logs/manifoldcf.log for appender MyFile
2019-12-19 18:09:05,848 Job notification thread ERROR Unable to write to
stream logs/manifoldcf.log for appender MyFile
2019-12-19 18:09:05,849 Seeding thread ERROR An exception occurred
processing Appender MyFile
org.apache.logging.log4j.core.appender.AppenderLoggingException: Error
flushing stream logs/manifoldcf.log
at
org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:159)

Can anybody suggest the reason behind this error?

Thanks
Priya

On Fri, Dec 20, 2019 at 3:37 PM Priya Arora  wrote:

> Hi Markus,
>
> Many thanks for your reply!!.
>
> I tried this approach to reproduce the scenario in a different
> environment, but the case  where I listed the error above is when I am
> crawling INTRANET sites which can be accessible over a remote server. Also
> I have used Transformation connectors:-Allow Documents, Tika Parser,
> Content Limiter( 1000), Metadata Adjuster.
>
> When tried reproducing the error with Public sites of the same domain and
> on a different server(DEV), it was successful, with no error.Also there was
> no any postgres related error.
>
> Can it depends observer related configurations like Firewall etc, as this
> case include some firewall,security related configurations.
>
> Thanks
> Priya
>
>
>
>
> On Fri, Dec 20, 2019 at 3:23 PM Markus Schuch 
> wrote:
>
>> Hi Priya,
>>
>> in my experience, i would focus on the OutOfMemoryError (OOME).
>> 8 Gigs can be enough, but they don't have to.
>>
>> At first i would check if the jvm is really getting the desired heap
>> size. The dockered environment make that a little harder find find out,
>> since you need to get access to the jvm metrics, e.g. via jmxremote.
>> Beeing able to monitor the jvm metrics helps you with correlating the
>> errors with the heap and garbage collection activity.
>>
>> The errors you see on postgresql jdbc driver might be very related to
>> the OOME.
>>
>> Some question i would ask myself:
>>
>> Do the problems repeatingly occur only when crawling this specific
>> content source or only with this specific output connection? Can you
>> reproduce it outside of docker in a controlled dev environment? Or is it
>> a more general problem with your manifoldcf instance?
>>
>> May be there are some huge files beeing crawled in your content source?
>> To you have any kind of transformations configured? (e.g. content size
>> limit?) You should try to see in the job's history if there are any
>> patterns, like the error rises always after encountering the same
>> document xy.
>>
>> Cheers
>> Markus
>>
>>
>>
>> On 20.12.2019 at 09:59, Priya Arora wrote:
>> > Hi  Markus ,
>> >
>> > Heap size defined is 8GB. Manifoldcf start-options-unix file  Xmx etc
>> > parameters is defined to have memory 8192mb.
>> >
>> > It seems to be an issue with memory also, and also when manifoldcf tries
>> > to communicate to Database. Do you explicitly define somewhere
>> > connection timer when to communicate to postgres.
>> > Postgres is installed as a part of docker image pull and then some
>> > changes in properties.xml(of manifoldcf) to connect to database.
>> > On the other hand Elastic search is also holding sufficient memory and
>> > Manifoldcf is also provided with 8 cores CPU.
>> >
>> > Can you suggest some solution.
>> >
>> > Thanks
>> > Priya
>> >
>> > On Fri, Dec 20, 2019 at 2:23 PM Markus Schuch > > > wrote:
>> >
>> > Hi Priya,
>> >
>> > your manifoldcf JVM suffers from high garbage collection pressure:
>> >
>> > java.lang.OutOfMemoryError: GC overhead limit exceeded
>> >
>> > What is your current heap size?
>> > Without knowing that, i suggest to increase the heap size. (java
>> > -Xmx...)
>> >
>> > Cheers,
>> > Markus
>> >
>> > On 20.12.2019 at 09:02, Priya Arora wrote:
>> > > Hi All,
>> > >
>> > > I am facing below error while accessing Manifoldcf. Requirement
>> is to
>> > > crawl data from a website using Repository as "Web" and Output
>> > connector
>> > > as "Elastic Search"
>> > > Manifoldcf is configured inside a docker container and also
>> > postgres is
>> > > used a docker container.
>> > > When launching manifold getting below error
>> > > image.png
>> > >
>> > > When checked logs:-
>> > > *1)sudo docker exec -it 0b872dfafc5c tail -1000
>> > > /usr/share/manifoldcf/example/logs/manifoldcf.log*
>> > > FATAL 2019-12-20T06:06:13,176 (Stuffer thread) - Error tossed:
>> 

Re: Manifoldcf server Error

2019-12-20 Thread Priya Arora
Hi Markus,

Many thanks for your reply!!.

I tried this approach to reproduce the scenario in a different environment,
but the case where I listed the error above is when I am crawling INTRANET
sites, which are accessible over a remote server. I have also used the
transformation connectors Allow Documents, Tika Parser, Content Limiter
(1000), and Metadata Adjuster.

When I tried reproducing the error with public sites of the same domain and
on a different server (DEV), it was successful, with no error. There was also
no Postgres-related error.

Can it depend on server-related configurations like firewalls, etc., as this
case includes some firewall/security-related configurations?

Thanks
Priya




On Fri, Dec 20, 2019 at 3:23 PM Markus Schuch  wrote:

> Hi Priya,
>
> in my experience, i would focus on the OutOfMemoryError (OOME).
> 8 Gigs can be enough, but they don't have to.
>
> At first i would check if the jvm is really getting the desired heap
> size. The dockered environment make that a little harder find find out,
> since you need to get access to the jvm metrics, e.g. via jmxremote.
> Beeing able to monitor the jvm metrics helps you with correlating the
> errors with the heap and garbage collection activity.
>
> The errors you see on postgresql jdbc driver might be very related to
> the OOME.
>
> Some question i would ask myself:
>
> Do the problems repeatingly occur only when crawling this specific
> content source or only with this specific output connection? Can you
> reproduce it outside of docker in a controlled dev environment? Or is it
> a more general problem with your manifoldcf instance?
>
> May be there are some huge files beeing crawled in your content source?
> To you have any kind of transformations configured? (e.g. content size
> limit?) You should try to see in the job's history if there are any
> patterns, like the error rises always after encountering the same
> document xy.
>
> Cheers
> Markus
>
>
>
> On 20.12.2019 at 09:59, Priya Arora wrote:
> > Hi  Markus ,
> >
> > Heap size defined is 8GB. Manifoldcf start-options-unix file  Xmx etc
> > parameters is defined to have memory 8192mb.
> >
> > It seems to be an issue with memory also, and also when manifoldcf tries
> > to communicate to Database. Do you explicitly define somewhere
> > connection timer when to communicate to postgres.
> > Postgres is installed as a part of docker image pull and then some
> > changes in properties.xml(of manifoldcf) to connect to database.
> > On the other hand Elastic search is also holding sufficient memory and
> > Manifoldcf is also provided with 8 cores CPU.
> >
> > Can you suggest some solution.
> >
> > Thanks
> > Priya
> >
> > On Fri, Dec 20, 2019 at 2:23 PM Markus Schuch  > > wrote:
> >
> > Hi Priya,
> >
> > your manifoldcf JVM suffers from high garbage collection pressure:
> >
> > java.lang.OutOfMemoryError: GC overhead limit exceeded
> >
> > What is your current heap size?
> > Without knowing that, i suggest to increase the heap size. (java
> > -Xmx...)
> >
> > Cheers,
> > Markus
> >
> > On 20.12.2019 at 09:02, Priya Arora wrote:
> > > Hi All,
> > >
> > > I am facing below error while accessing Manifoldcf. Requirement is
> to
> > > crawl data from a website using Repository as "Web" and Output
> > connector
> > > as "Elastic Search"
> > > Manifoldcf is configured inside a docker container and also
> > postgres is
> > > used a docker container.
> > > When launching manifold getting below error
> > > image.png
> > >
> > > When checked logs:-
> > > *1)sudo docker exec -it 0b872dfafc5c tail -1000
> > > /usr/share/manifoldcf/example/logs/manifoldcf.log*
> > > FATAL 2019-12-20T06:06:13,176 (Stuffer thread) - Error tossed:
> Timer
> > > already cancelled.
> > > java.lang.IllegalStateException: Timer already cancelled.
> > > at java.util.Timer.sched(Timer.java:397) ~[?:1.8.0_232]
> > > at java.util.Timer.schedule(Timer.java:193) ~[?:1.8.0_232]
> > > at
> > >
> org.postgresql.jdbc.PgConnection.addTimerTask(PgConnection.java:1113)
> > > ~[postgresql-42.1.3.jar:42.1.3]
> > > at
> > > org.postgresql.jdbc.PgStatement.startTimer(PgStatement.java:887)
> > > ~[postgresql-42.1.3.jar:42.1.3]
> > > at
> > >
> org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:427)
> > > ~[postgresql-42.1.3.jar:42.1.3]
> > > at
> > org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354)
> > > ~[postgresql-42.1.3.jar:42.1.3]
> > > at
> > >
> >
>  
> org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169)
> > > ~[postgresql-42.1.3.jar:42.1.3]
> > > at
> > >
> >
>  
> org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:136)
> > > ~[postgresql-42.1.3.jar:42.1.3]
> > >

Re: Manifoldcf server Error

2019-12-20 Thread Markus Schuch
Hi Priya,

In my experience, I would focus on the OutOfMemoryError (OOME).
8 GB can be enough, but it doesn't have to be.

First I would check whether the JVM is really getting the desired heap
size. The dockerized environment makes that a little harder to find out,
since you need access to the JVM metrics, e.g. via jmxremote.
Being able to monitor the JVM metrics helps you correlate the
errors with the heap and garbage collection activity.
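
A rough sketch of what that could look like, assuming the JVM runs as PID 1
inside the container, the image ships the JDK tools, and "manifoldcf" stands
in for your actual container name:

  # Sample heap occupancy and GC activity every 5 seconds; the FGC/FGCT
  # columns show full-GC count and time, O shows old-generation utilisation.
  docker exec manifoldcf jstat -gcutil 1 5000

  # For full JMX monitoring (JConsole/VisualVM) the JVM would additionally be
  # started with jmxremote options along these lines, with the port published
  # via "docker run ... -p 9010:9010":
  #   -Dcom.sun.management.jmxremote.port=9010
  #   -Dcom.sun.management.jmxremote.authenticate=false
  #   -Dcom.sun.management.jmxremote.ssl=false
  #   -Djava.rmi.server.hostname=<docker-host>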

The errors you see from the PostgreSQL JDBC driver might well be related to
the OOME.

Some questions I would ask myself:

Do the problems repeatedly occur only when crawling this specific
content source, or only with this specific output connection? Can you
reproduce it outside of Docker in a controlled dev environment? Or is it
a more general problem with your ManifoldCF instance?

Maybe there are some huge files being crawled in your content source?
Do you have any kind of transformations configured (e.g. a content size
limit)? You should look in the job's history for patterns, like the
error always arising after encountering the same document xy.

Cheers
Markus



On 20.12.2019 at 09:59, Priya Arora wrote:
> Hi  Markus ,
>
> Heap size defined is 8GB. Manifoldcf start-options-unix file  Xmx etc
> parameters is defined to have memory 8192mb.
>
> It seems to be an issue with memory also, and also when manifoldcf tries
> to communicate to Database. Do you explicitly define somewhere
> connection timer when to communicate to postgres.
> Postgres is installed as a part of docker image pull and then some
> changes in properties.xml(of manifoldcf) to connect to database.
> On the other hand Elastic search is also holding sufficient memory and
> Manifoldcf is also provided with 8 cores CPU.
>
> Can you suggest some solution.
>
> Thanks
> Priya
>
> On Fri, Dec 20, 2019 at 2:23 PM Markus Schuch  > wrote:
>
> Hi Priya,
>
> your manifoldcf JVM suffers from high garbage collection pressure:
>
>     java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> What is your current heap size?
> Without knowing that, i suggest to increase the heap size. (java
> -Xmx...)
>
> Cheers,
> Markus
>
> On 20.12.2019 at 09:02, Priya Arora wrote:
> > Hi All,
> >
> > I am facing below error while accessing Manifoldcf. Requirement is to
> > crawl data from a website using Repository as "Web" and Output
> connector
> > as "Elastic Search"
> > Manifoldcf is configured inside a docker container and also
> postgres is
> > used a docker container.
> > When launching manifold getting below error
> > image.png
> >
> > When checked logs:-
> > *1)sudo docker exec -it 0b872dfafc5c tail -1000
> > /usr/share/manifoldcf/example/logs/manifoldcf.log*
> > FATAL 2019-12-20T06:06:13,176 (Stuffer thread) - Error tossed: Timer
> > already cancelled.
> > java.lang.IllegalStateException: Timer already cancelled.
> >         at java.util.Timer.sched(Timer.java:397) ~[?:1.8.0_232]
> >         at java.util.Timer.schedule(Timer.java:193) ~[?:1.8.0_232]
> >         at
> > org.postgresql.jdbc.PgConnection.addTimerTask(PgConnection.java:1113)
> > ~[postgresql-42.1.3.jar:42.1.3]
> >         at
> > org.postgresql.jdbc.PgStatement.startTimer(PgStatement.java:887)
> > ~[postgresql-42.1.3.jar:42.1.3]
> >         at
> > org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:427)
> > ~[postgresql-42.1.3.jar:42.1.3]
> >         at
> org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354)
> > ~[postgresql-42.1.3.jar:42.1.3]
> >         at
> >
> 
> org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169)
> > ~[postgresql-42.1.3.jar:42.1.3]
> >         at
> >
> 
> org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:136)
> > ~[postgresql-42.1.3.jar:42.1.3]
> >         at
> > org.postgresql.jdbc.PgConnection.isValid(PgConnection.java:1311)
> > ~[postgresql-42.1.3.jar:42.1.3]
> >         at
> >
> 
> org.apache.manifoldcf.core.jdbcpool.ConnectionPool.getConnection(ConnectionPool.java:92)
> > ~[mcf-core.jar:?]
> >         at
> >
> 
> org.apache.manifoldcf.core.database.ConnectionFactory.getConnectionWithRetries(ConnectionFactory.java:126)
> > ~[mcf-core.jar:?]
> >         at
> >
> 
> org.apache.manifoldcf.core.database.ConnectionFactory.getConnection(ConnectionFactory.java:75)
> > ~[mcf-core.jar:?]
> >         at
> >
> 
> org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:797)
> > ~[mcf-core.jar:?]
> >         at
> >
> 
> org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1457)
> > ~[mcf-core.jar:?]
> >         at
> >
> 
> 

Re: Manifoldcf server Error

2019-12-20 Thread Priya Arora
Hi  Markus ,

The heap size defined is 8 GB. In the ManifoldCF start-options-unix file, the
Xmx etc. parameters are set to give 8192 MB of memory.

It seems to be an issue with memory, and it also occurs when ManifoldCF tries
to communicate with the database. Do you explicitly define somewhere a
connection timer for when to communicate with Postgres?
Postgres is installed as part of a Docker image pull, and then some changes
are made in properties.xml (of ManifoldCF) to connect to the database.
On the other hand, Elasticsearch also has sufficient memory, and
ManifoldCF is also provided with 8 CPU cores.
Can you suggest a solution?
Can you suggest some solution.

Thanks
Priya

On Fri, Dec 20, 2019 at 2:23 PM Markus Schuch  wrote:

> Hi Priya,
>
> your manifoldcf JVM suffers from high garbage collection pressure:
>
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> What is your current heap size?
> Without knowing that, i suggest to increase the heap size. (java -Xmx...)
>
> Cheers,
> Markus
>
On 20.12.2019 at 09:02, Priya Arora wrote:
> > Hi All,
> >
> > I am facing below error while accessing Manifoldcf. Requirement is to
> > crawl data from a website using Repository as "Web" and Output connector
> > as "Elastic Search"
> > Manifoldcf is configured inside a docker container and also postgres is
> > used a docker container.
> > When launching manifold getting below error
> > image.png
> >
> > When checked logs:-
> > *1)sudo docker exec -it 0b872dfafc5c tail -1000
> > /usr/share/manifoldcf/example/logs/manifoldcf.log*
> > FATAL 2019-12-20T06:06:13,176 (Stuffer thread) - Error tossed: Timer
> > already cancelled.
> > java.lang.IllegalStateException: Timer already cancelled.
> > at java.util.Timer.sched(Timer.java:397) ~[?:1.8.0_232]
> > at java.util.Timer.schedule(Timer.java:193) ~[?:1.8.0_232]
> > at
> > org.postgresql.jdbc.PgConnection.addTimerTask(PgConnection.java:1113)
> > ~[postgresql-42.1.3.jar:42.1.3]
> > at
> > org.postgresql.jdbc.PgStatement.startTimer(PgStatement.java:887)
> > ~[postgresql-42.1.3.jar:42.1.3]
> > at
> > org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:427)
> > ~[postgresql-42.1.3.jar:42.1.3]
> > at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354)
> > ~[postgresql-42.1.3.jar:42.1.3]
> > at
> >
> org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169)
> > ~[postgresql-42.1.3.jar:42.1.3]
> > at
> >
> org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:136)
> > ~[postgresql-42.1.3.jar:42.1.3]
> > at
> > org.postgresql.jdbc.PgConnection.isValid(PgConnection.java:1311)
> > ~[postgresql-42.1.3.jar:42.1.3]
> > at
> >
> org.apache.manifoldcf.core.jdbcpool.ConnectionPool.getConnection(ConnectionPool.java:92)
> > ~[mcf-core.jar:?]
> > at
> >
> org.apache.manifoldcf.core.database.ConnectionFactory.getConnectionWithRetries(ConnectionFactory.java:126)
> > ~[mcf-core.jar:?]
> > at
> >
> org.apache.manifoldcf.core.database.ConnectionFactory.getConnection(ConnectionFactory.java:75)
> > ~[mcf-core.jar:?]
> > at
> >
> org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:797)
> > ~[mcf-core.jar:?]
> > at
> >
> org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1457)
> > ~[mcf-core.jar:?]
> > at
> >
> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146)
> > ~[mcf-core.jar:?]
> > at
> >
> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204)
> > ~[mcf-core.jar:?]
> > at
> >
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837)
> > ~[mcf-core.jar:?]
> > at
> >
> org.apache.manifoldcf.core.database.BaseTable.performQuery(BaseTable.java:221)
> > ~[mcf-core.jar:?]
> > at
> >
> org.apache.manifoldcf.crawler.jobs.Jobs.getActiveJobConnections(Jobs.java:736)
> > ~[mcf-pull-agent.jar:?]
> > at
> >
> org.apache.manifoldcf.crawler.jobs.JobManager.getNextDocuments(JobManager.java:2869)
> > ~[mcf-pull-agent.jar:?]
> > at
> >
> org.apache.manifoldcf.crawler.system.StufferThread.run(StufferThread.java:186)
> > [mcf-pull-agent.jar:?]
> > *2)sudo docker logs  --tail 1000*
> > Exception in thread "PostgreSQL-JDBC-SharedTimer-1"
> > java.lang.OutOfMemoryError: GC overhead limit exceeded
> > at java.util.ArrayList.iterator(ArrayList.java:840)
> > at
> > java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
> > at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
> > at java.net.InetAddress.getAllByName(InetAddress.java:1193)
> > at java.net.InetAddress.getAllByName(InetAddress.java:1127)
> > at java.net.InetAddress.getByName(InetAddress.java:1077)
> > at java.net.InetSocketAddress.<init>(InetSocketAddress.java:220)
> > at 

Re: Manifoldcf server Error

2019-12-20 Thread Markus Schuch
Hi Priya,

your ManifoldCF JVM suffers from high garbage collection pressure:

java.lang.OutOfMemoryError: GC overhead limit exceeded

What is your current heap size?
Without knowing that, I suggest increasing the heap size (java -Xmx...).
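
A rough sketch of how to verify and raise it (again, "manifoldcf" is a
placeholder container name, and it assumes the JVM is PID 1 in the container
and the JDK tools are available in the image):

  # Confirm the maximum heap the running JVM actually got.
  docker exec manifoldcf jcmd 1 VM.flags | tr ' ' '\n' | grep MaxHeapSize

  # If it is lower than expected, raise -Xmx in the JVM options file the start
  # script reads (the "start-options-unix" file mentioned in this thread),
  # e.g. change -Xmx8192m to a larger value, and keep the container's memory
  # limit (docker run --memory=...) comfortably above the Java heap.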

Cheers,
Markus

On 20.12.2019 at 09:02, Priya Arora wrote:
> Hi All,
>
> I am facing the below error while accessing ManifoldCF. The requirement is to
> crawl data from a website using "Web" as the repository connector and
> "Elastic Search" as the output connector.
> ManifoldCF is configured inside a Docker container, and Postgres is also
> used as a Docker container.
> When launching ManifoldCF I get the below error
> image.png
>
> When checked logs:-
> *1)sudo docker exec -it 0b872dfafc5c tail -1000
> /usr/share/manifoldcf/example/logs/manifoldcf.log*
> FATAL 2019-12-20T06:06:13,176 (Stuffer thread) - Error tossed: Timer
> already cancelled.
> java.lang.IllegalStateException: Timer already cancelled.
>         at java.util.Timer.sched(Timer.java:397) ~[?:1.8.0_232]
>         at java.util.Timer.schedule(Timer.java:193) ~[?:1.8.0_232]
>         at
> org.postgresql.jdbc.PgConnection.addTimerTask(PgConnection.java:1113)
> ~[postgresql-42.1.3.jar:42.1.3]
>         at
> org.postgresql.jdbc.PgStatement.startTimer(PgStatement.java:887)
> ~[postgresql-42.1.3.jar:42.1.3]
>         at
> org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:427)
> ~[postgresql-42.1.3.jar:42.1.3]
>         at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354)
> ~[postgresql-42.1.3.jar:42.1.3]
>         at
> org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169)
> ~[postgresql-42.1.3.jar:42.1.3]
>         at
> org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:136)
> ~[postgresql-42.1.3.jar:42.1.3]
>         at
> org.postgresql.jdbc.PgConnection.isValid(PgConnection.java:1311)
> ~[postgresql-42.1.3.jar:42.1.3]
>         at
> org.apache.manifoldcf.core.jdbcpool.ConnectionPool.getConnection(ConnectionPool.java:92)
> ~[mcf-core.jar:?]
>         at
> org.apache.manifoldcf.core.database.ConnectionFactory.getConnectionWithRetries(ConnectionFactory.java:126)
> ~[mcf-core.jar:?]
>         at
> org.apache.manifoldcf.core.database.ConnectionFactory.getConnection(ConnectionFactory.java:75)
> ~[mcf-core.jar:?]
>         at
> org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:797)
> ~[mcf-core.jar:?]
>         at
> org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1457)
> ~[mcf-core.jar:?]
>         at
> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146)
> ~[mcf-core.jar:?]
>         at
> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204)
> ~[mcf-core.jar:?]
>         at
> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837)
> ~[mcf-core.jar:?]
>         at
> org.apache.manifoldcf.core.database.BaseTable.performQuery(BaseTable.java:221)
> ~[mcf-core.jar:?]
>         at
> org.apache.manifoldcf.crawler.jobs.Jobs.getActiveJobConnections(Jobs.java:736)
> ~[mcf-pull-agent.jar:?]
>         at
> org.apache.manifoldcf.crawler.jobs.JobManager.getNextDocuments(JobManager.java:2869)
> ~[mcf-pull-agent.jar:?]
>         at
> org.apache.manifoldcf.crawler.system.StufferThread.run(StufferThread.java:186)
> [mcf-pull-agent.jar:?]
> *2)sudo docker logs  --tail 1000*
> Exception in thread "PostgreSQL-JDBC-SharedTimer-1"
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>         at java.util.ArrayList.iterator(ArrayList.java:840)
>         at
> java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
>         at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
>         at java.net.InetAddress.getAllByName(InetAddress.java:1193)
>         at java.net.InetAddress.getAllByName(InetAddress.java:1127)
>         at java.net.InetAddress.getByName(InetAddress.java:1077)
>         at java.net.InetSocketAddress.<init>(InetSocketAddress.java:220)
>         at org.postgresql.core.PGStream.<init>(PGStream.java:66)
>         at
> org.postgresql.core.QueryExecutorBase.sendQueryCancel(QueryExecutorBase.java:155)
>         at
> org.postgresql.jdbc.PgConnection.cancelQuery(PgConnection.java:971)
>         at org.postgresql.jdbc.PgStatement.cancel(PgStatement.java:812)
>         at org.postgresql.jdbc.PgStatement$1.run(PgStatement.java:880)
>         at java.util.TimerThread.mainLoop(Timer.java:555)
>         at java.util.TimerThread.run(Timer.java:505)
> 2019-12-19 18:09:05,848 Job start thread ERROR Unable to write to stream
> logs/manifoldcf.log for appender MyFile
> 2019-12-19 18:09:05,848 Seeding thread ERROR Unable to write to stream
> logs/manifoldcf.log for appender MyFile
> 2019-12-19 18:09:05,848 Job reset thread ERROR Unable to write to stream
> logs/manifoldcf.log for appender MyFile
> 2019-12-19 18:09:05,848 Job notification thread ERROR Unable to write to
> stream