[jira] [Commented] (CONNECTORS-1564) Support preemptive authentication to Solr connector

2019-01-28 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754330#comment-16754330
 ] 

Michael Osipov commented on CONNECTORS-1564:


Does the config work out of the box? Did you try setting up just mod_proxy and 
nothing else? Try to change as little as possible. It worked for me with Tomcat.

> Support preemptive authentication to Solr connector
> ---
>
> Key: CONNECTORS-1564
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1564
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Lucene/SOLR connector
>Reporter: Erlend Garåsen
>Assignee: Karl Wright
>Priority: Major
> Attachments: CONNECTORS-1564.patch
>
>
> We should post preemptively in case the Solr server requires basic 
> authentication. This will make the communication between ManifoldCF and Solr 
> much more efficient than the following exchange:
>  * Send an HTTP POST request to Solr
>  * Solr sends a 401 response
>  * Send the same request, but with a "{{Authorization: Basic}}" header
> With preemptive authentication, we can send the header in the first request.
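
The difference between the two exchanges described above can be sketched in a few lines of Java. This is only an illustration using the JDK's own java.net.http and Base64 classes, not the attached patch and not the connector's actual HTTP code; the class name, URL, and credentials are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative sketch only: names, URL, and credentials are assumptions.
final class PreemptiveBasicAuth {

    // Builds the value of the Authorization header for HTTP Basic authentication.
    static String basicHeader(String user, String password) {
        String credentials = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    // Creates a POST request that already carries the credentials, so the
    // server has no reason to answer the first attempt with a 401 challenge.
    static HttpRequest preemptivePost(URI updateUri, String user,
                                      String password, String body) {
        return HttpRequest.newBuilder(updateUri)
                .header("Authorization", basicHeader(user, password))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        // Building the request does not open a connection.
        HttpRequest request = preemptivePost(
                URI.create("http://localhost:8983/solr/example/update"),
                "user", "pass", "[]");
        System.out.println(
                request.headers().firstValue("Authorization").orElse("<none>"));
    }
}
```

Without the header on the first request, the full POST body would be sent twice: once to receive the 401 and once more with the credentials.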



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CONNECTORS-1564) Support preemptive authentication to Solr connector

2019-01-28 Thread JIRA


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754141#comment-16754141
 ] 

Erlend Garåsen commented on CONNECTORS-1564:


Yes, I know, [~michael-o]. :-/

I tried to copy our existing httpd.conf file, but for some reason I cannot 
access any web pages after starting Apache. This is what I can see in my 
error_log:

{{[Mon Jan 28 16:36:31.641341 2019] [mpm_event:notice] [pid 24718:tid 
140039187126080] AH00489: Apache/2.5.1-dev (Unix) configured -- resuming normal 
operations}}
{{[Mon Jan 28 16:36:31.646482 2019] [core:notice] [pid 24718:tid 
140039187126080] AH00094: Command line: 
'/home/erlendfg-drift/httpd/build/bin/httpd'}}

This is what I changed in my copied httpd.conf file:
 * Placed the User and Group settings inside a  block
 * Removed the following: Include conf.modules.d/*.conf
 * Added the necessary LoadModule lines.

There are probably other things which need to be changed as well, but then I 
need to read the 2.5 documentation and spend a lot more time.



[jira] [Commented] (CONNECTORS-1564) Support preemptive authentication to Solr connector

2019-01-28 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754119#comment-16754119
 ] 

Michael Osipov commented on CONNECTORS-1564:


This is a common problem with RHEL/CentOS: Apache/2.4.6. Ancient software. In 
most cases, I don't rely on prepackaged software on RHEL7 because it is so old 
that one cannot reasonably work with it.



[jira] [Comment Edited] (CONNECTORS-1564) Support preemptive authentication to Solr connector

2019-01-28 Thread JIRA


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754096#comment-16754096
 ] 

Erlend Garåsen edited comment on CONNECTORS-1564 at 1/28/19 4:12 PM:
-

Unfortunately it wasn't possible to just replace the mod_proxy_http.so module. 
Apache crashed. So I will now configure Apache from trunk to use the same 
settings as the one in production (mod_ssl, virtual host etc.). I guess this is 
something I have to do tomorrow.

I also tried to reuse my existing httpd.conf file in my new Apache version, but 
that didn't work, which means I need to do this when I have more time.

{{Syntax error on line 6 of /etc/httpd/conf.modules.d/00-base.conf: API module 
structure 'access_compat_module' in file 
/etc/httpd/modules/mod_access_compat.so is garbled - expected signature 
41503235 but saw 41503234 - perhaps this is not an Apache module DSO, or was 
compiled for a different Apache version?}}
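
The two signatures in the error above are easier to compare once they are read as ASCII text instead of hexadecimal. The following is a small sketch for decoding them; the class name is hypothetical, and the interpretation that the two tags correspond to a trunk build and a 2.4 build is my reading of the message rather than something stated in it.

```java
// Illustrative sketch only: decodes the hexadecimal signatures from the
// error message into the ASCII characters they represent.
final class ModuleMagicCookie {

    // Interprets each pair of hexadecimal digits as one ASCII character.
    static String decode(String hexSignature) {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i + 2 <= hexSignature.length(); i += 2) {
            int code = Integer.parseInt(hexSignature.substring(i, i + 2), 16);
            text.append((char) code);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Values copied from the error message above.
        System.out.println("expected: " + decode("41503235"));
        System.out.println("saw:      " + decode("41503234"));
    }
}
```

The decoded values differ only in the last character, which fits the situation described: the newly built server was loading a module file from the packaged installation under /etc/httpd/modules.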


was (Author: erlendfg):
Unfortunately it wasn't possible to just replace the mod_proxy_http.so module. 
Apache crashed. So I will now configure Apache from trunk to use the same 
settings as the one in production (mod_ssl, virtual host etc.). I guess this is 
something I have to do tomorrow, but I hope it's sufficient to create a symlink 
to httpd.conf.



[jira] [Commented] (CONNECTORS-1564) Support preemptive authentication to Solr connector

2019-01-28 Thread Michael Osipov (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754017#comment-16754017
 ] 

Michael Osipov commented on CONNECTORS-1564:


Alright, will wait for your response/findings.



[jira] [Commented] (CONNECTORS-1564) Support preemptive authentication to Solr connector

2019-01-28 Thread JIRA


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753911#comment-16753911
 ] 

Erlend Garåsen commented on CONNECTORS-1564:


[~michael-o], I finally managed to build HTTPd. Installing the development 
tools and PCRE on RHEL did the trick. I will get back to you regarding the 
tests within a couple of days.



[jira] [Commented] (CONNECTORS-1575) inconsistant use of value-labels

2019-01-28 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753908#comment-16753908
 ] 

Karl Wright commented on CONNECTORS-1575:
-

This is because there are two somewhat different internal representations 
involved.  While it is unfortunate that they appear inconsistent, there is 
nothing that can be done to change them since doing so would be backwards 
incompatible.
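
Since the labels will stay as they are, a client that reads job JSON may want to accept both spellings. Below is a minimal sketch of such a lookup, assuming the JSON has already been parsed into nested maps by whatever JSON library the client uses; the class and method names are hypothetical and not part of the ManifoldCF API.

```java
import java.util.Map;

// Illustrative sketch only: a client-side helper, not ManifoldCF code.
final class ValueLabel {

    // Returns the node's value whether it is stored under "_value_" or
    // under "value" (the label reported for the schedule fields).
    static Object read(Map<String, ?> node) {
        if (node.containsKey("_value_")) {
            return node.get("_value_");
        }
        return node.get("value");
    }

    public static void main(String[] args) {
        System.out.println(read(Map.of("_value_", "example job")));
        System.out.println(read(Map.of("value", "5")));
    }
}
```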


> inconsistant use of value-labels 
> -
>
> Key: CONNECTORS-1575
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1575
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: API
>Affects Versions: ManifoldCF 2.12
>Reporter: Tim Steenbeke
>Priority: Minor
> Attachments: image-2019-01-28-11-57-46-738.png
>
>
> When retrieving a job using the API, there seem to be inconsistencies in the 
> returned JSON of the job.
> For the schedule values 'hourofday', 'minutesofhour', etc., the label of the 
> value is 'value', while for all other value-labels it is '_value_'.
>  
> !image-2019-01-28-11-57-46-738.png!





[jira] [Commented] (CONNECTORS-1574) Performance tuning of manifold

2019-01-28 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753913#comment-16753913
 ] 

Karl Wright commented on CONNECTORS-1574:
-

If you look in the ManifoldCF log, all queries that take more than a minute to 
execute are logged, along with an EXPLAIN plan.  Could you look at your logs 
and find the queries and provide their explanation?

The quality of the query plans is usually dependent on the quality of the 
statistics that the database keeps.  When the statistics are out of date, then 
the plan sometimes gets horribly bad.  ManifoldCF *attempts* to keep up with 
this by re-analyzing tables after a fixed number of changes, but necessarily it 
cannot do better than estimate the number of changes and their effects on the 
table statistics.  So if you are experiencing problems with certain queries, 
you can set properties.xml values that increase the frequency of analyze 
operations for that table.  But first we need to know what's going wrong.
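
The idea of re-analyzing a table after a fixed number of estimated changes can be illustrated with a small counter. This is a conceptual sketch only, not ManifoldCF's implementation: the class, its methods, and the threshold values are hypothetical, and a real version would issue an ANALYZE statement against the database where this one merely counts.

```java
// Illustrative sketch only: models "analyze after N estimated changes".
final class AnalyzeTracker {
    private final int threshold;
    private int changesSinceAnalyze = 0;
    private int analyzeCount = 0;

    AnalyzeTracker(int threshold) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
    }

    // Records an estimated number of changed rows and reports whether the
    // threshold was reached, which is where ANALYZE would be run.
    boolean noteChanges(int estimatedChangedRows) {
        changesSinceAnalyze += estimatedChangedRows;
        if (changesSinceAnalyze >= threshold) {
            changesSinceAnalyze = 0;
            analyzeCount++;
            return true;
        }
        return false;
    }

    int analyzeCount() {
        return analyzeCount;
    }

    public static void main(String[] args) {
        AnalyzeTracker jobsTable = new AnalyzeTracker(1000);
        System.out.println(jobsTable.noteChanges(600)); // below the threshold
        System.out.println(jobsTable.noteChanges(400)); // threshold reached
    }
}
```

Lowering the threshold makes the analyze step fire more often, which corresponds to raising the analyze frequency for a table, at the cost of running the operation more frequently.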


> Performance tuning of manifold
> --
>
> Key: CONNECTORS-1574
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1574
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: File system connector, JCIFS connector, Solr 6.x 
> component
>Affects Versions: ManifoldCF 2.5
> Environment: Apache manifold installed in Linux machine
> Linux version 3.10.0-327.el7.ppc64le
> Red Hat Enterprise Linux Server release 7.2 (Maipo)
>Reporter: balaji
>Assignee: Karl Wright
>Priority: Critical
>  Labels: performance
>
> My team is using *Apache ManifoldCF 2.5 with SOLR Cloud* for indexing data. 
> We currently have 450-500 jobs which need to run simultaneously. 
> We need to index JSON data, and we are using the *file system* connector type 
> with *postgres* as the backend database. 
> We are facing several issues:
> 1. Scheduling works for some jobs and doesn't work for others. 
> 2. Some jobs complete, while others hang and never complete.
> 3. Earlier, one job indexed 6 documents in 15 minutes, but now even a 
> directory path with 5 documents takes 20 minutes or sometimes doesn't 
> complete.
> 4. The "list all jobs" or "status and job management" page sometimes doesn't 
> load. Looking at pg_stat_activity, we observe that 2 queries are in a waiting 
> state, because of which the page doesn't load. If we kill those queries or 
> restart ManifoldCF, the issue is resolved and the page loads properly.
> Queries getting stuck:
> 1. SELECT ID,FAILTIME, FAILCOUNT, SEEDINGVERSION, STATUS FROM JOBS WHERE 
> (STATUS=$1 OR STATUS=$2) FOR UPDATE
> 2. UPDATE JOBS SET ERRORTEXT=NULL, ENDTIME=NULL, WINDOWEND=NULL, STATUS=$1 
> WHERE ID=$2
> Note: We have deployed ManifoldCF on *Linux*. Our major requirement is 
> scheduling jobs to run every 15 minutes.
> Please help us fine-tune ManifoldCF so that it runs smoothly and acts as a 
> robust system.





[jira] [Commented] (CONNECTORS-1575) inconsistant use of value-labels

2019-01-28 Thread Tim Steenbeke (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753912#comment-16753912
 ] 

Tim Steenbeke commented on CONNECTORS-1575:
---

Ok, thank you for your fast response.



[jira] [Assigned] (CONNECTORS-1574) Performance tuning of manifold

2019-01-28 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1574:
---

Assignee: Karl Wright



[jira] [Resolved] (CONNECTORS-1575) inconsistant use of value-labels

2019-01-28 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1575.
-
Resolution: Won't Fix



[jira] [Created] (CONNECTORS-1575) inconsistant use of value-labels

2019-01-28 Thread Tim Steenbeke (JIRA)
Tim Steenbeke created CONNECTORS-1575:
-

 Summary: inconsistant use of value-labels 
 Key: CONNECTORS-1575
 URL: https://issues.apache.org/jira/browse/CONNECTORS-1575
 Project: ManifoldCF
  Issue Type: Bug
  Components: API
Affects Versions: ManifoldCF 2.12
Reporter: Tim Steenbeke
 Attachments: image-2019-01-28-11-57-46-738.png

When retrieving a job using the API, there seem to be inconsistencies in the 
returned JSON of the job.

For the schedule values 'hourofday', 'minutesofhour', etc., the label of the 
value is 'value', while for all other value-labels it is '_value_'.

 

!image-2019-01-28-11-57-46-738.png!





[jira] [Created] (CONNECTORS-1574) Performance tuning of manifold

2019-01-28 Thread balaji (JIRA)
balaji created CONNECTORS-1574:
--

 Summary: Performance tuning of manifold
 Key: CONNECTORS-1574
 URL: https://issues.apache.org/jira/browse/CONNECTORS-1574
 Project: ManifoldCF
  Issue Type: Bug
  Components: File system connector, JCIFS connector, Solr 6.x component
Affects Versions: ManifoldCF 2.5
 Environment: Apache manifold installed in Linux machine

Linux version 3.10.0-327.el7.ppc64le

Red Hat Enterprise Linux Server release 7.2 (Maipo)
Reporter: balaji


My team is using *Apache ManifoldCF 2.5 with SOLR Cloud* for indexing data. 
We currently have 450-500 jobs which need to run simultaneously. We need to 
index JSON data, and we are using the *file system* connector type with 
*postgres* as the backend database. 

We are facing several issues:
1. Scheduling works for some jobs and doesn't work for others. 
2. Some jobs complete, while others hang and never complete.
3. Earlier, one job indexed 6 documents in 15 minutes, but now even a 
directory path with 5 documents takes 20 minutes or sometimes doesn't 
complete.
4. The "list all jobs" or "status and job management" page sometimes doesn't 
load. Looking at pg_stat_activity, we observe that 2 queries are in a waiting 
state, because of which the page doesn't load. If we kill those queries or 
restart ManifoldCF, the issue is resolved and the page loads properly.
Queries getting stuck:
1. SELECT ID,FAILTIME, FAILCOUNT, SEEDINGVERSION, STATUS FROM JOBS WHERE 
(STATUS=$1 OR STATUS=$2) FOR UPDATE
2. UPDATE JOBS SET ERRORTEXT=NULL, ENDTIME=NULL, WINDOWEND=NULL, STATUS=$1 
WHERE ID=$2

Note: We have deployed ManifoldCF on *Linux*. Our major requirement is 
scheduling jobs to run every 15 minutes.

Please help us fine-tune ManifoldCF so that it runs smoothly and acts as a 
robust system.


