Karl Wright created CONNECTORS-968:
--
Summary: All output connectors should be updated so they can
coexist with each other in the UI
Key: CONNECTORS-968
URL: https://issues.apache.org/jira/browse/CONNECTORS-968
[
https://issues.apache.org/jira/browse/CONNECTORS-967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035514#comment-14035514
]
Karl Wright commented on CONNECTORS-967:
Looks good -- please go ahead and
[
https://issues.apache.org/jira/browse/CONNECTORS-967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shinichiro Abe resolved CONNECTORS-967.
---
Resolution: Fixed
Fix Version/s: ManifoldCF 1.7
Assignee:
Karl Wright created CONNECTORS-969:
--
Summary: Output connectors that use a JSON description string
likely to fail
Key: CONNECTORS-969
URL: https://issues.apache.org/jira/browse/CONNECTORS-969
But guys, why not simply switch to classic SolrJ SolrDocument creation and
ingestion into the Solr server? Easy and straightforward!
In the end, at that point the RepositoryDocument will be only a Map of
metadata and values.
Content will be part of that, so I guess the conversion to a SolrDocument
Hello Karl,
What I was thinking is:
assuming we have the Tika Connector, the responsibility to extract content
will pass from Solr to the Tika processor.
So we can change the part in the Solr Connector that manages the building
of the request to send to the Extract update handler.
Particularly
Karl Wright created CONNECTORS-970:
--
Summary: Hadoop error and silent failure
Key: CONNECTORS-970
URL: https://issues.apache.org/jira/browse/CONNECTORS-970
Project: ManifoldCF
Issue Type:
Hi Alessandro,
Ideally, I think that text extraction from rich documents should be
ManifoldCF's responsibility, not Solr's.
So the ideal place to implement it would be in the new document processing
pipeline (using Tika).
--
Matteo Grolla
Sourcesense - making sense of Open Source
Hi Alessandro,
The reason for backwards compatibility is obvious: people upgrade
ManifoldCF all the time, and when they do it should not stop working for
them.
Putting Tika all the time in the pipeline is also not appropriate for other
output connections. Even if you did it just for Solr, you'd
[
https://issues.apache.org/jira/browse/CONNECTORS-968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright resolved CONNECTORS-968.
Resolution: Fixed
various individual commits
All output connectors should be updated
2014-06-18 16:10 GMT+01:00 Karl Wright daddy...@gmail.com:
Hi Alessandro,
The reason for backwards compatibility is obvious: people upgrade
ManifoldCF all the time, and when they do it should not stop working for
them.
OK, I agree!
Putting Tika all the time in the pipeline is also not
Hi all,
By now it is becoming clear that ManifoldCF has accumulated a lot of
backwards-compatibility dead weight we have to carry around from release to
release. However, ManifoldCF 2.0 will present an opportunity to break
backwards compatibility with the 1.x releases. Originally, I was
Hi Karl,
Big +1 to making 2.0 our next release.
My suggestions :
* Looks like discussion is ongoing, but let's assume 2.0 will be the next release and
consider switching to the Tika transformer
as in: http://searchhub.org/2012/02/14/indexing-with-solrj/
* Let's make SharePoint 2010 the default value in
Good suggestions!
Would you be willing to create Jira tickets for these, and make the Fix In
Version field be 2.0? Thanks in advance!
Karl
On Wed, Jun 18, 2014 at 1:10 PM, Ahmet Arslan iori...@yahoo.com.invalid
wrote:
Hi Karl,
Big +1 to making 2.0 our next release.
My suggestions :
*
Since a Tika transformer is critical to this plan, I'm going to code one up
now. Stay tuned!
Karl
On Wed, Jun 18, 2014 at 11:59 AM, Karl Wright daddy...@gmail.com wrote:
bq. I don't agree on this. Why is it not appropriate for all the connectors?
Some output connectors want the document in
+1 from me for breaking backwards compatibility and focusing on non-SQL
data store.
Piergiorgio
2014-06-18 18:19 GMT+02:00 Karl Wright daddy...@gmail.com:
Hi all,
By now it is becoming clear that ManifoldCF has accumulated a lot of
backwards-compatibility dead weight we have to carry
Hi Piergiorgio,
Just to clarify -- I don't have a workable plan yet for a non-SQL data
store, so maybe that waits until 3.0.
Karl
On Wed, Jun 18, 2014 at 3:13 PM, Piergiorgio Lucidi piergior...@apache.org
wrote:
+1 from me for breaking backwards compatibility and focusing on non-SQL
data
Hi,
What is a non-SQL data store? You mean removing MCF's dependency on
PostgreSQL, MySQL, Derby, etc.?
By the way, the Solr guys are looking for a Data Import Handler (DIH) replacement.
See the thread: http://search-lucene.com/m/WwzTb2z1w7F
DIH is mostly used to sync RDBMS to Solr.
What do
bq. What is a non-SQL data store? You mean removing MCF's dependency on
PostgreSQL, MySQL, Derby, etc.?
See CONNECTORS-286.
bq. What do you think about this? Can MCF be a DIH replacement? How does our
DB crawler compare to DIH?
In theory it could. I'd hesitate before claiming feature-to-feature
Hi Muhammed,
Can you go into more depth about these:
1) Sharding support
2) Selectable seeding model.
Thanks,
Karl
On Wed, Jun 18, 2014 at 5:38 PM, Karl Wright daddy...@gmail.com wrote:
bq. What is a non-SQL data store? You mean removing MCF's dependency on
PostgreSQL, MySQL, Derby, etc.?
[
https://issues.apache.org/jira/browse/CONNECTORS-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036471#comment-14036471
]
Karl Wright commented on CONNECTORS-954:
Committed the basic connector:
[
https://issues.apache.org/jira/browse/CONNECTORS-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036739#comment-14036739
]
Karl Wright commented on CONNECTORS-954:
Added the field mapping tab: r1603687