[
https://issues.apache.org/jira/browse/BLUR-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475774#comment-13475774
]
Gagan Deep Juneja commented on BLUR-18:
---------------------------------------
I tried to simulate our discussion in code. As per my understanding, the
getSplits function would look somewhat like the following:
public List<InputSplit> getSplits(JobContext context) throws IOException,
    InterruptedException {
  List<InputSplit> splits = new ArrayList<InputSplit>();
  String query = context.getConfiguration().get("query");
  ShardServer server = new ShardServer();
  Session session = new Session(UUID.randomUUID().toString());
  try {
    server.addReadSession(session);
    server.executeQuery(session, query);
    String tableName = getTableName(query);
    List<BlurShard> shardList = getShardLayout(tableName);
    // One split per shard, all sharing the same read session.
    for (BlurShard shard : shardList) {
      splits.add(new BlurSplit(shard, session));
    }
    return splits;
  } catch (BlurException e) {
    throw new IOException(e);
  } catch (TException e) {
    throw new IOException(e);
  }
}
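For what it's worth, here is a rough sketch of what the BlurSplit carried by
each mapper might look like. The shardName, shardHost, and sessionId fields
are my assumptions, not anything from the current code; in the real mapreduce
API the class would extend org.apache.hadoop.mapreduce.InputSplit and
implement Writable, but the serialization contract (write/readFields over
DataOutput/DataInput) is shown with plain JDK types so the sketch stands
alone:

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical split: one shard plus the shared read session id.
// In Hadoop this would extend org.apache.hadoop.mapreduce.InputSplit and
// implement org.apache.hadoop.io.Writable; only JDK types are used here.
class BlurSplit {

  private String shardName;   // assumed: which shard this split covers
  private String shardHost;   // assumed: host currently serving the shard
  private String sessionId;   // the read session opened in getSplits()

  // No-arg constructor, required for Writable-style deserialization.
  BlurSplit() {
  }

  BlurSplit(String shardName, String shardHost, String sessionId) {
    this.shardName = shardName;
    this.shardHost = shardHost;
    this.sessionId = sessionId;
  }

  // Writable contract: serialize the split for shipping to the task.
  void write(DataOutput out) throws IOException {
    out.writeUTF(shardName);
    out.writeUTF(shardHost);
    out.writeUTF(sessionId);
  }

  // Writable contract: rebuild the split on the task side.
  void readFields(DataInput in) throws IOException {
    shardName = in.readUTF();
    shardHost = in.readUTF();
    sessionId = in.readUTF();
  }

  // InputSplit contract: hint the scheduler toward the shard's host.
  String[] getLocations() {
    return new String[] { shardHost };
  }

  // InputSplit contract: size is unknown up front; a real implementation
  // might report the shard's row count here.
  long getLength() {
    return 0;
  }

  String getShardName() { return shardName; }
  String getSessionId() { return sessionId; }
}
```

The key design point is that the session id travels with every split, so each
RecordReader can reattach to the query results that getSplits already started
on the server side.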
I am looking forward to your comments. I am still unclear on how
ControllerServer, ShardServer, and the shards themselves would actually be
implemented in this new API.
> Rework the MapReduce Library to implement Input/OutputFromats
> -------------------------------------------------------------
>
> Key: BLUR-18
> URL: https://issues.apache.org/jira/browse/BLUR-18
> Project: Apache Blur
> Issue Type: Improvement
> Reporter: Aaron McCurry
>
> Currently the only way to implement indexing is to use the BlurReducer. A
> better way to implement this would be to support Hadoop input/outputformats
> in both the new and old api's. This would allow an easier integration with
> other Hadoop projects such as Hive and Pig.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira