Hi all,

I have Apache HTTP Server logs on several machines and want to query
these log files.

So I installed Drill (distributed mode) on these machines, for example
node1 and node2.

I connect with one of these commands:
sqlline -u jdbc:drill:zk=node1,node2
or
sqlline -u jdbc:drill:drillbit=node1,node2

Then I run a query like: select count(*) from dfs.`/apache/logs/access_log`
but I only get the data from one machine.

Maybe I could upload all the log files to S3 or Hadoop.
But is there an easy way to query all the local files on the different
machines with Drill?

If new features need to be developed to support this requirement, how much
work would that be? For example, would it be enough to revise the physical
plan distribution code, or would we need to write a completely new data
source plugin?

I found these discussions, but there seems to be no clear answer.

https://stackoverflow.com/questions/29365320/apache-drill-in-distributed-mode

http://mail-archives.apache.org/mod_mbox/drill-user/201506.mbox/thread

https://stackoverflow.com/questions/33952568/how-to-configure-drill-to-use-all-the-nodes-for-a-query-by-creating-multiple-fr

Thanks,

Wang Liang
