my data files could get big. Is Drill Spark integration a solution in that
case?
Drill remains a solution if your data gets big because it scales horizontally, like Spark. Whichever query engine you choose, however, you will have to replace the Windows Desktop folder with some scalable, network-enabled storage. Neither Drill nor Spark provides its own storage layer, but compatible options include HDFS and S3.

After setting the workspace to query the file system, how do I execute such
a query from Java?
Drill only runs queries written in SQL. You can send that SQL from your Java application to Drill using JDBC or Drill's REST API. If you prefer to generate the SQL from object-oriented Java expressions, take a look at jOOQ <https://www.jooq.org/>. A little dialect work might be required to make jOOQ fully compatible with Drill, but (a) we'd be prepared to help you with that and (b) Drill's SQL dialect is by and large vanilla ANSI SQL:2003.
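
As a rough sketch of the JDBC route: the code below assumes a Drillbit listening on localhost at the default user port 31010, the Drill JDBC driver jar (drill-jdbc-all) on the application's classpath, and a workspace named dfs.desktop as discussed further down this thread. The class name DrillJdbcSketch and the helper selectAll are made up for illustration. Note that the JSON file stays on disk in the workspace folder; it does not need to be packed into a jar.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

public class DrillJdbcSketch {
    // Assumption: a Drillbit running on this machine, default user port.
    static final String URL = "jdbc:drill:drillbit=localhost:31010";

    // Hypothetical helper: builds a SELECT over a file in a dfs workspace.
    // Backticks quote the file name because it contains a dot.
    static String selectAll(String workspace, String fileName) {
        if (fileName.contains("`")) {
            throw new IllegalArgumentException("file name must not contain a backtick");
        }
        return "SELECT * FROM dfs." + workspace + ".`" + fileName + "`";
    }

    public static void main(String[] args) throws SQLException {
        String sql = selectAll("desktop", "file.json");
        // try-with-resources closes the result set, statement and connection.
        try (Connection con = DriverManager.getConnection(URL);
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            ResultSetMetaData meta = rs.getMetaData();
            while (rs.next()) {
                StringBuilder row = new StringBuilder();
                for (int i = 1; i <= meta.getColumnCount(); i++) {
                    if (i > 1) {
                        row.append(", ");
                    }
                    row.append(meta.getColumnLabel(i)).append('=').append(rs.getString(i));
                }
                System.out.println(row);
            }
        }
    }
}
```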

Regards
James

On 2022/11/25 09:54, marc nicole wrote:
Hi,

After setting the workspace to query the file system, how do I execute such
a query from Java?

On Fri, Nov 25, 2022 at 02:25, Charles Givre <[email protected]> wrote:

Hi Marc,
I should have asked, are you running Drill on a single Windows machine?
If so, Drill will be able to query anything you throw at it. If your data
starts to get bigger than a single machine can handle, you'll need to set
up a Drill cluster with multiple nodes. This is no different from Spark. I
would suggest using Drill to convert the data to Parquet format. Often you
can achieve a 10x reduction in file size and extreme improvements in query
speed.
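
One way the Parquet conversion could look from Java is a CREATE TABLE AS statement sent over JDBC. This is only a sketch under assumptions: a Drillbit on localhost at port 31010, the Drill JDBC driver on the classpath, a readable dfs.desktop workspace, and the writable dfs.tmp workspace from Drill's default configuration as the target. The class name, the helper ctas and the table name file_parquet are invented for the example.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class DrillParquetSketch {
    // Hypothetical helper: builds a CTAS statement that copies a file from
    // the dfs.desktop workspace into a new table under dfs.tmp.
    static String ctas(String targetTable, String sourceFile) {
        return "CREATE TABLE dfs.tmp.`" + targetTable + "` AS "
             + "SELECT * FROM dfs.desktop.`" + sourceFile + "`";
    }

    public static void main(String[] args) throws SQLException {
        // Assumption: a Drillbit running on this machine, default user port.
        try (Connection con = DriverManager.getConnection("jdbc:drill:drillbit=localhost:31010");
             Statement stmt = con.createStatement()) {
            // Parquet is Drill's default output format; setting it makes the intent explicit.
            stmt.execute("ALTER SESSION SET `store.format` = 'parquet'");
            stmt.execute(ctas("file_parquet", "file.json"));
        }
    }
}
```

Later queries would then read from dfs.tmp.`file_parquet` instead of the original JSON file.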

As for configuring Drill, take a look here:
https://drill.apache.org/docs/workspaces/. This explains how to set up
a workspace. What you'll want to do is set the workspace to the path to
your desktop. Then you can query the files as noted below.
Best,
-- C





On Nov 24, 2022, at 6:05 PM, marc nicole<[email protected]>  wrote:

Also, how do I execute queries such as SELECT *
FROM dfs.desktop.`file.json` in Java?

On Thu, Nov 24, 2022 at 23:31, Charles Givre <[email protected]> wrote:

Hi Marc,
Welcome to Drill!  Firstly, take a look at the docs for querying a file
system:
https://drill.apache.org/docs/querying-a-file-system-introduction/
When you start up Drill out of the box, there is a connector called dfs
which points to the local filesystem.  You can configure a workspace to
your desktop folder, then all you have to do is write a query like:

SELECT *
FROM dfs.desktop.`file.json`

If you're looking to do this programmatically from Java and your data
isn't too big, the easiest way is probably to use Drill's REST API (
https://drill.apache.org/docs/rest-api-introduction/).  You can make a
simple HTTP call to Drill and get the data that way.
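
A minimal sketch of such an HTTP call, using the java.net.http client that ships with Java 11 and later. It assumes Drill's web server is reachable at http://localhost:8047 (the default port) without authentication; the class name DrillRestSketch and the helper jsonBody are made up for illustration, and the hand-rolled escaping only covers the common cases, so a JSON library would be the safer choice in real code.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class DrillRestSketch {
    // Assumption: Drill's REST endpoint on the default web port.
    static final String ENDPOINT = "http://localhost:8047/query.json";

    // Hypothetical helper: wraps a SQL string in the JSON body Drill expects.
    // Escapes quotes, backslashes and common whitespace control characters only.
    static String jsonBody(String sql) {
        StringBuilder escaped = new StringBuilder();
        for (char c : sql.toCharArray()) {
            switch (c) {
                case '"':  escaped.append("\\\""); break;
                case '\\': escaped.append("\\\\"); break;
                case '\n': escaped.append("\\n");  break;
                case '\r': escaped.append("\\r");  break;
                case '\t': escaped.append("\\t");  break;
                default:   escaped.append(c);
            }
        }
        return "{\"queryType\":\"SQL\",\"query\":\"" + escaped + "\"}";
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        String sql = "SELECT * FROM dfs.desktop.`file.json`";
        HttpRequest request = HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody(sql)))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // The body is a JSON document containing the result columns and rows.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```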

Hope this helps!
-- C



On Nov 24, 2022, at 5:02 PM, marc nicole<[email protected]>  wrote:

Hi,

I want to query a JSON file placed in the Desktop folder (Windows).
How can I do that in Java?

PS: I saw this type of code:

    Connection con = null;

    con = new Driver().connect(DRILL_JDBC_LOCAL_URI, getDefaultProperties());
    Statement stmt = con.createStatement();
    ResultSet rs = stmt.executeQuery(DRILL_SAMPLE_QUERY);...


But that requires using JDBC and placing the JSON in a jar file on Drill's
classpath, which I don't want.

Thanks.

