Hi Vijay,

Welcome to the Drill community! The questions you have are common ones; my answers are inline below.

1. Drill is an incubation project. Is anyone using it for production applications?

There are a number of organizations that are starting to deploy Drill for evaluation purposes. These deployments are querying large amounts of real data, but as far as we know there are no real production deployments today. Large portions of Drill are well tested, but we are still fixing the remaining bugs as we work towards a stable 1.0 release.

2. Data repository recommendation: I have the source data as relational and want to perform complex ad hoc queries involving joins and aggregates. Any recommendation on the data repository for better performance?

The fastest format we currently support is Parquet, a columnar file format currently under incubation at Apache. We also support JSON, delimited text, and any format with a Hive SerDe available, but these read paths have not been optimized as much.

3. Encrypted data: Does Drill work against encrypted data? Any documentation around it would be helpful.

As far as I know, there is no support for encrypted data.

4. Concurrent queries: I expect 100s of users running against the Drill query engine. Is there any limitation on the number of queries running against Drill?

There is no strict limit on the number of users or queries that can run at any time. Drill's architecture is designed to be highly scalable. There is no single bottleneck to limit the number of concurrent connections, as any node in a Drill cluster can act as the head node for a query. Different clients can connect to different nodes to spread the query planning burden throughout the cluster. The physical operators are also spread around the cluster, and we are actively working on better memory management for individual fragments of execution plans so that more concurrent queries can run with limited resources.

On Fri, Oct 24, 2014 at 10:56 AM, arorav <[email protected]> wrote:

> Hi All
>
> I am new to Drill and have few questions:
>
> 1. Drill is an incubation project, Is anyone using for production
> applications?
>
> 2. Data repository recommendation: I have the source data as
> relational and want to perform complex adhoc queries involving joins and
> aggregates. Any recommendation on the data repository for better
> performance.
>
> 3. Encrypted Data: Does drill works against encrypted data? Any
> documentation around it would be helpful?
>
> 4. Concurrent Queries: As I expect 100s of users running against the
> drill query engine. Is there any limitation on number of queries running
> against drill?
>
> All help would be appreciated.
>
> Thanks
>
> Vijay Arora
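
P.S. On answer 4: since any node can act as the head node for a query, one simple way to spread the planning load is to hand out nodes to clients in round-robin order. The snippet below is only a minimal sketch of that idea, not anything Drill ships. The host names, the port, and the `assign_clients` helper are all assumptions made up for illustration; substitute the addresses of your own cluster.

```python
from itertools import cycle

# Hypothetical node addresses (assumed for illustration only).
DRILL_NODES = [
    "drill-node1:31010",
    "drill-node2:31010",
    "drill-node3:31010",
]


def assign_clients(clients, nodes):
    """Pair each client with a node in round-robin order.

    Every node can plan queries, so cycling through the list gives each
    node roughly the same number of client connections.
    """
    node_cycle = cycle(nodes)
    return {client: next(node_cycle) for client in clients}


# Six hypothetical users spread over three nodes: two users per node.
assignments = assign_clients([f"user{i}" for i in range(6)], DRILL_NODES)
for client, node in assignments.items():
    print(f"{client} -> {node}")
```

In a real deployment you would open the actual connection to the assigned node where the sketch only prints the pairing.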
