[jira] [Commented] (TIKA-1196) JAX-RS server only responds to queries to/from http://localhost

2013-11-17 Thread Rian Stockbower (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824939#comment-13824939
 ] 

Rian Stockbower commented on TIKA-1196:
---

Unfortunately that didn't work. I've just emailed the CXF user list.

 JAX-RS server only responds to queries to/from http://localhost
 ---

 Key: TIKA-1196
 URL: https://issues.apache.org/jira/browse/TIKA-1196
 Project: Tika
  Issue Type: Bug
  Components: server
Affects Versions: 1.4
 Environment: Mac OS X, Windows Server 2008
Reporter: Rian Stockbower
Priority: Minor
  Labels: JAXRS, hostname, web-service

 I'm not sure if this is a problem with the Tika JAX-RS server, or with how it 
 uses CXF under the hood. Anyway:
 I have a large text extraction job (10-15 million documents) that I'm using 
 the web service for. It would be nice to be able to distribute this 
 horizontally across multiple nodes to speed up the processing. I had thought 
 to have a job queue with a couple consumers, farming out PUT requests across 
 several Tika web service endpoints.
 But the JAX-RS web service will only respond to queries made to 
 {{http://localhost:9998/tika}}.
 I can't call {{http://hostname:9998/tika}} -- even if it's still a local 
 operation.
 Here is a list of things I've tried:
 * I changed line 89 of TikaServerCLI.java to compute the name of the host at 
 runtime. No go: the server starts up, and immediately terminates.
 * I changed line 89 of TikaServerCLI.java to be a hostname (not a FQDN), and 
 re-compiled:
 ** {{mvn compile -rf :tika-server}} compiles successfully. Start up the 
 server, and it terminates, just like when I tried to compute the hostname at 
 runtime
 ** {{mvn install}} from the topmost Tika directory gets the service 
 responding to both {{http://hostname:9998/tika}} and 
 {{http://hostname.domain.net:9998/tika}} (Seemed weird, this is why I was 
 thinking it was further up the chain in CXF?)
 In a perfect world:
 # The server should respond to any valid calls that make sense:
 #* 127.0.0.1
 #* localhost
 #* hostname
 #* host.domain.tld
 #* ip_address
 # A {{hostname}} invocation parameter could be used to limit what the service 
 responds to when it's started up. (A very optional, nice-to-have.)
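
 The distribution idea described above (a job queue whose consumers spread PUT
 requests across several Tika endpoints) could look roughly like the sketch
 below. It is only an illustration under stated assumptions: the class name
 TikaEndpointPool and the host names node1 and node2 are hypothetical, plain
 round-robin is just one possible scheduling choice, and the requests are built
 but never sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: hands out Tika endpoints in round-robin order. */
final class TikaEndpointPool {
    private final List<URI> endpoints = new ArrayList<>();
    private final AtomicInteger counter = new AtomicInteger();

    TikaEndpointPool(List<String> baseUrls) {
        if (baseUrls.isEmpty()) {
            throw new IllegalArgumentException("at least one endpoint is required");
        }
        for (String url : baseUrls) {
            endpoints.add(URI.create(url));
        }
    }

    /** Returns the next endpoint; safe to call from several consumer threads. */
    URI next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), endpoints.size());
        return endpoints.get(index);
    }

    /** Builds (but does not send) a PUT request carrying one document. */
    HttpRequest putRequest(byte[] document) {
        return HttpRequest.newBuilder(next())
                .PUT(HttpRequest.BodyPublishers.ofByteArray(document))
                .build();
    }

    public static void main(String[] args) {
        // node1 and node2 are placeholder host names, not real servers.
        TikaEndpointPool pool = new TikaEndpointPool(List.of(
                "http://node1:9998/tika", "http://node2:9998/tika"));
        for (int n = 0; n < 4; n++) {
            HttpRequest request = pool.putRequest(new byte[0]);
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

 Each consumer would pass the built request to an HTTP client of its choice.
 This only helps once every node accepts requests on a non-loopback address,
 which is the problem reported in this issue.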



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (TIKA-1196) JAX-RS server only responds to queries to/from http://localhost

2013-11-17 Thread Sergey Beryozkin (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824948#comment-13824948
 ] 

Sergey Beryozkin commented on TIKA-1196:
---

Hi. A number of fixes to HTTP host resolution have been applied to the CXF 
code across multiple releases. 
I think users sometimes use 0.0.0.0 instead of the host name, or simply use 
a relative address. Rian, can you please try CXF 2.7.7?
Cheers, Sergey



[jira] [Created] (TIKA-1197) Update CXF dependency in Tika Server to CXF 2.7.7 or CXF 2.7.8

2013-11-17 Thread Sergey Beryozkin (JIRA)
Sergey Beryozkin created TIKA-1197:
--

 Summary: Update CXF dependency in Tika Server to CXF 2.7.7 or CXF 2.7.8
 Key: TIKA-1197
 URL: https://issues.apache.org/jira/browse/TIKA-1197
 Project: Tika
  Issue Type: Task
  Components: server
Reporter: Sergey Beryozkin


The server module depends on CXF 2.6.1, which is a very old version. Many fixes, 
improvements and new features have been introduced in CXF 2.7.x and 
3.0.0-SNAPSHOT.

Proposal: update the server to CXF 2.7.7, or possibly to CXF 2.7.8, which is 
due shortly.
CXF 2.7.x implements JAX-RS 2.0 m10 and supports JAX-RS 1.1 applications.

We can also move to CXF 3.0.0 (an RC is due very shortly), which implements 
JAX-RS 2.0 completely.

I can look at creating a patch after CXF 2.7.8 is released.





[jira] [Created] (TIKA-1198) Consider optionally utilizing CXF JAX-RS Attachment support

2013-11-17 Thread Sergey Beryozkin (JIRA)
Sergey Beryozkin created TIKA-1198:
--

 Summary: Consider optionally utilizing CXF JAX-RS Attachment support
 Key: TIKA-1198
 URL: https://issues.apache.org/jira/browse/TIKA-1198
 Project: Tika
  Issue Type: Wish
  Components: server
Reporter: Sergey Beryozkin
Priority: Minor


CXF offers fairly extensive support for multiparts:
http://cxf.apache.org/docs/jax-rs-multiparts.html

Perhaps some of that could help the server offer more options for 
uploading and downloading files.





[jira] [Commented] (TIKA-1196) JAX-RS server only responds to queries to/from http://localhost

2013-11-17 Thread Rian Stockbower (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824956#comment-13824956
 ] 

Rian Stockbower commented on TIKA-1196:
---

That worked, Sergey. Changing localhost to 0.0.0.0 now lets me hit the service 
using any valid address.
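
For readers unfamiliar with the distinction, here is a minimal sketch in plain 
Java (not Tika or CXF code) of how a bind address is classified. 0.0.0.0 is the 
wildcard address, so a listener bound to it accepts connections on every local 
interface, while a listener bound to a loopback address such as 127.0.0.1 is 
reachable only from the same machine. The class and method names are 
hypothetical.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.UnknownHostException;

/** Illustrative only: shows what a bind address implies for reachability. */
final class BindAddressDemo {

    /** Classifies a literal IP address the way a listening socket would treat it. */
    static String classify(String ipLiteral) {
        InetAddress address;
        try {
            // Literal IP addresses are parsed locally; no DNS lookup takes place.
            address = InetAddress.getByName(ipLiteral);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a usable address: " + ipLiteral, e);
        }
        if (address.isAnyLocalAddress()) {
            return "wildcard";  // listens on every local interface
        }
        if (address.isLoopbackAddress()) {
            return "loopback";  // reachable from this machine only
        }
        return "specific";      // listens on that one address only
    }

    public static void main(String[] args) throws IOException {
        for (String host : new String[] {"127.0.0.1", "0.0.0.0"}) {
            // Port 0 asks the OS for any free port, so the demo never clashes with 9998.
            try (ServerSocket socket = new ServerSocket(0, 50, InetAddress.getByName(host))) {
                System.out.println(host + " -> " + classify(host)
                        + ", listening on " + socket.getLocalSocketAddress());
            }
        }
    }
}
```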



[jira] [Updated] (TIKA-1196) JAX-RS server only responds to queries to/from http://localhost

2013-11-17 Thread Rian Stockbower (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rian Stockbower updated TIKA-1196:
--

Attachment: tika-1196.patch

I've attached a patch file that just changes localhost to 0.0.0.0, which allows 
users to hit the endpoint using any valid IP or hostname.

Attempting to move the JAX-RS server to CXF 2.7.8 is a little beyond my skill.



[jira] [Commented] (TIKA-1196) JAX-RS server only responds to queries to/from http://localhost

2013-11-17 Thread Nick Burch (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824978#comment-13824978
 ] 

Nick Burch commented on TIKA-1196:
--

I'm not sure if that should be an option, or if it's OK to change the default?



[jira] [Commented] (TIKA-1196) JAX-RS server only responds to queries to/from http://localhost

2013-11-17 Thread Rian Stockbower (JIRA)

[ 
https://issues.apache.org/jira/browse/TIKA-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13824982#comment-13824982
 ] 

Rian Stockbower commented on TIKA-1196:
---

It seems weird to restrict access to the endpoint to only loopback addresses.

That said, I'm working on something a little more interesting/robust.



[jira] [Updated] (TIKA-1196) JAX-RS server only responds to queries to/from http://localhost

2013-11-17 Thread Rian Stockbower (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rian Stockbower updated TIKA-1196:
--

Attachment: tika-1196b.patch

Disregard my first patch. This one changes the default behavior so that the 
service responds to any valid hostname or IP address. It also adds a CLI 
parameter to control the address, with instructions for the user on how to 
restrict usage to loopback addresses only.
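
As a rough illustration of that kind of option handling (this is not the code 
in the attached patch), the sketch below parses a host and a port from the 
command line and falls back to defaults. The flag names --host and --port, the 
class name ServerAddressOptions, and the choice of 0.0.0.0 as the default host 
are assumptions made for the example; port 9998 is the port used in the URLs 
in this issue.

```java
/** Hypothetical sketch of host/port handling; not the code from the attached patch. */
final class ServerAddressOptions {
    static final String DEFAULT_HOST = "0.0.0.0"; // assumed default: wildcard address
    static final int DEFAULT_PORT = 9998;         // port used in the URLs in this issue

    final String host;
    final int port;

    private ServerAddressOptions(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Parses "--host <value>" and "--port <value>" (assumed flag names). */
    static ServerAddressOptions parse(String[] args) {
        String host = DEFAULT_HOST;
        int port = DEFAULT_PORT;
        for (int i = 0; i < args.length; i++) {
            boolean hasValue = i + 1 < args.length;
            if (hasValue && args[i].equals("--host")) {
                host = args[++i];
            } else if (hasValue && args[i].equals("--port")) {
                port = Integer.parseInt(args[++i]);
            } else {
                throw new IllegalArgumentException(
                        "unrecognized or incomplete option: " + args[i]);
            }
        }
        return new ServerAddressOptions(host, port);
    }

    /** Builds the address from the parsed values, never from the defaults directly. */
    String endpointUrl() {
        return "http://" + host + ":" + port + "/";
    }

    public static void main(String[] args) {
        System.out.println(parse(args).endpointUrl());
    }
}
```

With these assumed flags, passing --host localhost would restrict the service 
to loopback access, and omitting both flags would listen on every local 
address at port 9998.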



[jira] [Updated] (TIKA-1196) JAX-RS server only responds to queries to/from http://localhost

2013-11-17 Thread Rian Stockbower (JIRA)

 [ 
https://issues.apache.org/jira/browse/TIKA-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rian Stockbower updated TIKA-1196:
--

Attachment: tika-1196c.patch

Patch C fixes a careless error where the default port was always used, 
regardless of what was specified by the user.
