[
https://issues.apache.org/jira/browse/BLUR-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475589#comment-13475589
]
Gagan Deep Juneja commented on BLUR-18:
---------------------------------------
I looked at new-api-prototype; it is much simpler than the older API. I am
considering the InputFormat first and have the following questions.
1. As I understand it, for a search query each Mapper will get one shard of the
table as its InputSplit. If this assumption is correct, what will our InputSplit
look like? Does it hold all the indexes in memory? My basic question is: what
will the getSplits method return?
2. Will we have a BlurShard server corresponding to each shard?
3. The BlurShard service has methods for executing a query and for fetching the
results after execution, so how does MR fit into this scenario?
4. I found a new term, "tuple", in the service.thrift definition. Is it a
replacement for the old term "record"?
5. What is the difference between the BlurTuple service and the BlurShard
service, given that their methods are almost the same? Does BlurTuple work as a
controller service?
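
To make question 1 concrete, here is a rough sketch of what I imagine getSplits
could return: one split per shard, where each split carries only metadata (table,
shard name, index path, preferred hosts) and not the index itself. This is only
an assumption on my side, not the prototype's design. It is plain Java and does
not extend Hadoop's real InputFormat/InputSplit classes; ShardSplit, the
"shard-%08d" naming and the <tableUri>/<shardName> layout are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ShardSplitSketch {

    // Hypothetical stand-in for a Hadoop InputSplit. It describes where one
    // shard's index lives; it does not hold any index data in memory.
    public static final class ShardSplit {
        public final String table;
        public final String shardName;
        public final String indexPath;   // assumed layout: <tableUri>/<shardName>
        public final String[] locations; // preferred hosts; empty if unknown

        public ShardSplit(String table, String shardName, String indexPath, String[] locations) {
            this.table = table;
            this.shardName = shardName;
            this.indexPath = indexPath;
            this.locations = locations;
        }
    }

    // Sketch of getSplits: returns exactly one split per shard of the table.
    public static List<ShardSplit> getSplits(String table, String tableUri, int shardCount) {
        List<ShardSplit> splits = new ArrayList<>();
        for (int i = 0; i < shardCount; i++) {
            String shardName = String.format("shard-%08d", i); // assumed naming
            splits.add(new ShardSplit(table, shardName, tableUri + "/" + shardName, new String[0]));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Example values only; the URI below is made up for illustration.
        for (ShardSplit s : getSplits("table1", "hdfs://namenode/blur/tables/table1", 3)) {
            System.out.println(s.table + " " + s.shardName + " -> " + s.indexPath);
        }
    }
}
```

If this is roughly right, each Mapper would open the index for its own split
through a RecordReader, and nothing would need to be loaded at getSplits time.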
> Rework the MapReduce Library to implement Input/OutputFormats
> -------------------------------------------------------------
>
> Key: BLUR-18
> URL: https://issues.apache.org/jira/browse/BLUR-18
> Project: Apache Blur
> Issue Type: Improvement
> Reporter: Aaron McCurry
>
> Currently the only way to implement indexing is to use the BlurReducer. A
> better way to implement this would be to support Hadoop input/output formats
> in both the new and old APIs. This would allow easier integration with
> other Hadoop projects such as Hive and Pig.