Re: Cannot find OracleDriver
type: JDBC
Authority: None
Database Type: ORACLE
Database and Host: 21:16:18:145:1521
Instance/Database: main
User Name:
Password: X

On Sun, Feb 26, 2012 at 2:48 PM, Karl Wright daddy...@gmail.com wrote:

I haven't seen this one. I'd love to know what the connect descriptor it refers to is. Can you tell me what the parameters all look like for the JDBC connection you are setting up? Are you specifying, for instance, the port as part of the server name?

Karl

On Sat, Feb 25, 2012 at 1:22 PM, Matthew Parker mpar...@apogeeintegration.com wrote:

Karl,

That fixed the driver issue. I just updated my start.jar file by hand for now.

The problem I have now is connecting to ORACLE. I can do it through NetBeans on my machine, but I cannot connect through ManifoldCF with the same settings. I get the following error:

    Error getting connection. Listener refused the connection with the following error. ORA-12514. TNS:Listener does not currently know of service requested in connect descriptor.

This might be more of an ORACLE issue than a Manifold issue, but I was wondering whether you've encountered the same thing during testing?

Regards,
Matt

On Fri, Jan 20, 2012 at 10:28 AM, Matthew Parker mpar...@apogeeintegration.com wrote:

Thanks Karl.

On Thu, Jan 19, 2012 at 9:44 PM, Karl Wright daddy...@gmail.com wrote:

The problem has been fixed on trunk. Basically, the instructions changed, as did some of the build files. It turned out to be extremely challenging to get JDBC drivers to run when they were loaded by anything other than the system classloader, so that's what I was forced to ensure.

Thanks,
Karl

On Thu, Jan 19, 2012 at 3:33 PM, Karl Wright daddy...@gmail.com wrote:

The ticket for this problem is CONNECTORS-390.

Karl

On Thu, Jan 19, 2012 at 3:05 PM, Matthew Parker mpar...@apogeeintegration.com wrote:

Many thanks. I'll give that a try.

On Thu, Jan 19, 2012 at 3:01 PM, Karl Wright daddy...@gmail.com wrote:

The problem is that the JDBC driver is using a pool driver that is in common with the core of ManifoldCF. So the connector-lib path, which only the connectors know about, won't do. That's a bug which I'll create a ticket for.

A temporary fix, which is slightly involved, requires you to put the ojdbc6.jar in the example/lib area, as you already tried, but in addition you will need to explicitly include the jar in your classpath. Normally the start.jar's manifest describes all the jars in the initial classpath. I thought it was possible to also include additional classpath info through the normal --classpath mechanism, but that doesn't seem to work, so you may be stuck with modifying the root build.xml file to add the jar to the manifest.

I'm going to experiment a bit and see if I can come up with something quickly.

Karl

On Thu, Jan 19, 2012 at 2:48 PM, Karl Wright daddy...@gmail.com wrote:

I was able to reproduce the problem. I'll get back to you when I figure out what the issue is.

Karl

On Thu, Jan 19, 2012 at 2:47 PM, Matthew Parker mpar...@apogeeintegration.com wrote:

I've used the jar file in NetBeans to connect to the database without any issue. Seems more like a class loader issue.

On Thu, Jan 19, 2012 at 2:41 PM, Matthew Parker mpar...@apogeeintegration.com wrote:

I have the latest release from the Apache Manifold site (i.e. 0.3-incubating). I checked the driver jar file with winzip, and the driver name is still the same (oracle.jdbc.OracleDriver). I'm running java 1.6.0_18-b7 on Windows XP SP 3.

On Thu, Jan 19, 2012 at 2:27 PM, Karl Wright daddy...@gmail.com wrote:

MCF's Oracle support was written against earlier versions of the Oracle driver. It is possible that they have changed the driver class. If the driver winds up in the dist/connector-lib directory (I'm assuming you are using trunk or 0.4-incubating), then it should be accessible.

Could you please try the following:

    jar -tf ojdbc6.jar | grep oracle/jdbc/OracleDriver

... assuming you are using Linux? If the driver class IS found, then the other possibility is that the jar is compiled against a later version of Java than the one you are using to run MCF. Please let me know what you find.

Karl

On Thu, Jan 19, 2012 at 1:43 PM, Matthew Parker mpar...@apogeeintegration.com wrote:

I downloaded MCF and started playing with the default setup under Jetty and Derby. It starts up without any issue. I would like to connect to our ORACLE database and import data into SOLR. I placed the ojdbc6.jar file in the connectors/jdbc/jdbc-drivers directory as stated
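
Karl's `jar -tf ... | grep` check in the thread above assumes a Linux shell, while Matthew reports running on Windows XP. As a rough, platform-independent sketch of the same two checks (is the driver class in the jar, and was it compiled for a newer Java than the one running MCF), something like the following could be used. The class name `DriverJarCheck` and its method are invented for illustration and are not part of ManifoldCF; the code also assumes a current JDK (it uses try-with-resources, which Java 6 itself does not support).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of ManifoldCF or the Oracle driver.
class DriverJarCheck {

    /**
     * Returns the class-file major version of className inside the jar,
     * or -1 if the jar contains no entry for that class.
     * (Major version 50 corresponds to Java 6, 51 to Java 7, and so on.)
     */
    static int classMajorVersion(String jarPath, String className) throws IOException {
        // oracle.jdbc.OracleDriver -> oracle/jdbc/OracleDriver.class
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            JarEntry entry = jar.getJarEntry(entryName);
            if (entry == null) {
                return -1; // class not present in this jar
            }
            try (InputStream raw = jar.getInputStream(entry);
                 DataInputStream in = new DataInputStream(raw)) {
                // Every class file starts with the magic number 0xCAFEBABE.
                if (in.readInt() != 0xCAFEBABE) {
                    throw new IOException("Not a class file: " + entryName);
                }
                in.readUnsignedShort();        // minor version, ignored here
                return in.readUnsignedShort(); // major version
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: java DriverJarCheck.java <path-to-driver-jar>");
            return;
        }
        int major = classMajorVersion(args[0], "oracle.jdbc.OracleDriver");
        if (major < 0) {
            System.out.println("oracle.jdbc.OracleDriver not found in " + args[0]);
        } else {
            System.out.println("oracle.jdbc.OracleDriver found, class-file major version " + major);
        }
    }
}
```

If the class is found and its major version is no higher than what the running JVM supports, both of Karl's initial hypotheses are ruled out, which is consistent with the thread's later conclusion that the cause was a classloader issue.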
Re: Transforming Manifold Metadata Prior to Pushing the Data into SOLR
Please see my response interleaved below.

On Mon, Feb 27, 2012 at 9:53 AM, Matthew Parker mpar...@apogeeintegration.com wrote:

> I'm trying to push data into SOLR. Is there a way to transform the metadata coming in from different data sources like SharePoint, and the File Share, prior to posting it into SOLR?

In general, ManifoldCF does not have data transformation abilities. With Solr, we rely on Solr Cell, which is a pipeline built on Tika, to extract content from documents and to perform transformations to document metadata etc. At some point it may become possible to do more transformations in ManifoldCF in order to support search engines that don't have a pipeline, but that is currently not available.

> For instance, documents have metadata specifying their file path. I need to transform that to a URL I can use within SOLR to retrieve that document through a servlet that I wrote.

The ManifoldCF model is that a connector creates a URL for each document that it indexes, using whatever makes sense for that particular repository to get you back to the document in question. So, for instance, Documentum documents will use URLs that point at Documentum's Webtop web application. It would be helpful to understand more precisely what you are trying to do. You could, for instance, modify your servlet to redirect to the ManifoldCF-generated URL. It gets indexed into Solr as the id field.

> Also, based on specific metadata that I'm seeing in the documents, I might want to conditionally populate other fields in the SOLR index.

That sounds like a job for the Tika pipeline to me.

Thanks,
Karl

> --
> This e-mail and any files transmitted with it may be proprietary. Please note that any views or opinions presented in this e-mail are solely those of the author and do not necessarily represent those of Apogee Integration.
Re: Cannot find OracleDriver
So if the Database and Host field really is 21:16:18:145:1521, try 21.16.18.145:1521 instead. ;-)

Karl

On Mon, Feb 27, 2012 at 9:22 AM, Matthew Parker mpar...@apogeeintegration.com wrote:

type: JDBC
Authority: None
Database Type: ORACLE
Database and Host: 21:16:18:145:1521
Instance/Database: main
User Name:
Password: X
Re: Cannot find OracleDriver
Sorry. I used the wrong character. It is configured for 21.16.18.145:1521

On Mon, Feb 27, 2012 at 10:27 AM, Karl Wright daddy...@gmail.com wrote:

So if the Database and Host field really is 21:16:18:145:1521, try 21.16.18.145:1521 instead. ;-)

Karl
Re: Transforming Manifold Metadata Prior to Pushing the Data into SOLR
Karl,

I'm importing data from a number of sources, including SharePoint, file shares, and an ORACLE database. The files/records are indexed by SOLR.

Right now, some of the import is done through custom SOLR Data Import Handler facilities. I'm hoping to move away from that in the future.

We are also aggregating some of the file share data into custom views on the web client. Lots of preprocessing. All of this is stored in the SOLR index with metadata describing how to display it within our custom web client. If the result is a certain type, we have custom templates that are displayed for it.

Manifold is a good solution for the SharePoint data. We don't really do any custom processing on it other than strip HTML from the text. It's the database and file share information that adds some challenges.

I'm hoping to get SOLR out of the text processing pipeline, and just let it index data. We are moving to Pentaho at some point, and we'll probably handle most of the custom metadata processing there. At some point, we'll possibly integrate Pentaho as an output connection in Manifold.

Thanks,
Matt

On Mon, Feb 27, 2012 at 10:04 AM, Karl Wright daddy...@gmail.com wrote:

It would be helpful to understand more precisely what you are trying to do.
Re: Cannot find OracleDriver
The connect URL it will use given those parameters is the following:

    String dburl = "jdbc:" + providerName + "//" + host + "/" + database + ((instanceName==null)?"":";instance="+instanceName);

Or, filled in with your parameters:

    jdbc:oracle:thin:@//21.16.18.145:1521/main

The "main" at the end is what I would wonder about. Oracle's default is "database"; if you leave the database/instance name field blank, that's what you'll get.

I also recommend turning on connector debugging, in properties.xml, by adding:

    <property name="org.apache.manifoldcf.connectors" value="DEBUG"/>

... and restarting ManifoldCF. Try viewing the connection in the UI; you should see the connect string logged, as well as possibly a more detailed response.

Thanks,
Karl

On Mon, Feb 27, 2012 at 11:12 AM, Matthew Parker mpar...@apogeeintegration.com wrote:

Sorry. I used the wrong character. It is configured for 21.16.18.145:1521
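
To make the connect-URL construction Karl describes easier to follow, here is a minimal standalone sketch of the same string concatenation. The class and method names are invented for illustration, and the provider value `oracle:thin:@` is an assumption inferred from the filled-in URL in Karl's message rather than something the thread states directly.

```java
// Hypothetical standalone sketch; not the actual ManifoldCF class.
class ConnectUrlSketch {

    /** Mirrors the concatenation quoted in the thread. */
    static String buildUrl(String providerName, String host, String database, String instanceName) {
        return "jdbc:" + providerName + "//" + host + "/" + database
            + ((instanceName == null) ? "" : ";instance=" + instanceName);
    }

    public static void main(String[] args) {
        // Assumed provider prefix for Oracle, inferred from the filled-in URL.
        String providerName = "oracle:thin:@";
        // "Database and Host" and "Instance/Database" values reported in the thread.
        String url = buildUrl(providerName, "21.16.18.145:1521", "main", null);
        System.out.println(url); // prints jdbc:oracle:thin:@//21.16.18.145:1521/main
    }
}
```

In this form of the URL, the segment after the last slash is what the listener is asked for, which is why Karl singles out "main" when the listener reports ORA-12514 (service not known).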
Re: Transforming Manifold Metadata Prior to Pushing the Data into SOLR
Thanks for the insights Karl. I'll have to give this a little more thought.

On Mon, Feb 27, 2012 at 1:22 PM, Karl Wright daddy...@gmail.com wrote:

If you've got a mix of data and only some of it comes through ManifoldCF, you can still use the ManifoldCF-generated URL for those that originate with ManifoldCF. This should even work for documents from the JCIFS connector - even though the default urls from this connector are file: style, there's a mapping you can set up for documents from that connector that maps to a URL format of your choice. Similarly, most JDBC document urls can readily be constructed as part of the database queries that you provide for the job.

So it does not sound like your servlet would have to do anything custom for any of the data that comes from ManifoldCF at this time, as long as you define your connections and jobs with some care as to the URLs they will produce.

Thanks,
Karl

On Mon, Feb 27, 2012 at 11:25 AM, Matthew Parker mpar...@apogeeintegration.com wrote:

Karl, I'm importing data from a number of sources to include: SharePoint, File shares, and an ORACLE database. The files/records are indexed by SOLR.
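
The idea of mapping a file-style document identifier to a URL format of your choice, which Karl mentions for the JCIFS connector, can be illustrated with a small regex-based sketch. This is not ManifoldCF's mapping feature or its configuration syntax; the class `UrlMappingSketch`, the `file://///fileserver/...` input form, and the `search.example.com` target are all hypothetical and chosen only to show the technique.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of regex-based URL mapping; not a ManifoldCF API.
class UrlMappingSketch {

    /** One mapping rule: a pattern for the whole source string and a replacement template. */
    private static final class Rule {
        final Pattern pattern;
        final String replacement;

        Rule(Pattern pattern, String replacement) {
            this.pattern = pattern;
            this.replacement = replacement;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Adds a rule; the regex must match the entire source string to apply. */
    void addRule(String regex, String replacement) {
        // Anchor at both ends so the match always covers the whole input.
        rules.add(new Rule(Pattern.compile("\\A(?:" + regex + ")\\z"), replacement));
    }

    /** Applies the first matching rule, or returns the source unchanged. */
    String map(String source) {
        for (Rule rule : rules) {
            Matcher m = rule.pattern.matcher(source);
            if (m.find()) {
                // $1, $2, ... in the replacement refer to groups of the rule's regex.
                return m.replaceFirst(rule.replacement);
            }
        }
        return source;
    }

    public static void main(String[] args) {
        UrlMappingSketch mapper = new UrlMappingSketch();
        mapper.addRule("file://///fileserver/(.*)", "http://search.example.com/docs/$1");
        System.out.println(mapper.map("file://///fileserver/share/report.doc"));
    }
}
```

A servlet like the one Matthew describes could then either serve the mapped URL directly or redirect to it, so that documents arriving through ManifoldCF need no per-source special handling.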