Re: Python Flight example with query command

2021-05-17 Thread Tanveer Ahmad - EWI
Hi David,


Thank you for the reply.


I have found that the Arrow DataFusion project
<https://github.com/apache/arrow-datafusion/tree/3be087a78846beffdbc4a9f80c73938fa18d24a7/datafusion-examples/examples#distributed>
offers something similar to what I am looking for. Do you think this project
implements the FlightSQL proposal?


Regards,
Tanveer Ahmad

From: David Li 
Sent: Saturday, May 15, 2021 3:10:53 PM
To: dev@arrow.apache.org
Subject: Re: Python Flight example with query command

Hey Tanveer,

Something like this should work:

$ python examples/flight/client.py put localhost:1234 foo.csv
File Name: foo.csv
Table rows= 1
   a  b
0  1  2
$ python examples/flight/client.py get localhost:1234 -p foo.csv
Ticket: 

   a  b
0  1  2

Note that Flight itself does not implement SQL query functionality or
anything of the sort. It is a common misconception, I think
exacerbated since Flight is often discussed in the context of products
like Dremio which implement such functionality on top of Flight. But
really, Flight itself is just a 'dumb pipe' for Arrow data for
building such systems.
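
To make that split concrete, here is a minimal sketch using only the Python
standard library. It is not Flight code: the TinyStore class and its put,
get_path and get_command methods are hypothetical stand-ins for the handlers a
Flight service would implement, and sqlite3 stands in for whatever query
engine such a service would have to bring along. The point is only that the
command string is opaque to the transport and is interpreted by the service.

```python
import csv
import io
import sqlite3


class TinyStore:
    """Hypothetical stand-in for a Flight service; not the pyarrow.flight API."""

    def __init__(self):
        # sqlite3 plays the role of the query engine the service must supply.
        self.db = sqlite3.connect(":memory:")

    def put(self, name, csv_text):
        """Store a CSV payload as a table (analogue of the example's put)."""
        rows = list(csv.reader(io.StringIO(csv_text)))
        header, body = rows[0], rows[1:]
        cols = ", ".join(f'"{c}"' for c in header)
        marks = ", ".join("?" for _ in header)
        # Sketch only: names are quoted but not otherwise sanitised.
        self.db.execute(f'CREATE TABLE "{name}" ({cols})')
        self.db.executemany(f'INSERT INTO "{name}" VALUES ({marks})', body)

    def get_path(self, name):
        """Return the stored rows unchanged (analogue of get -p)."""
        return self.db.execute(f'SELECT * FROM "{name}"').fetchall()

    def get_command(self, sql):
        """Run a command; the transport never interprets this string itself."""
        return self.db.execute(sql).fetchall()


store = TinyStore()
store.put("foo.csv", "a,b\n1,2\n3,4\n")
print(store.get_path("foo.csv"))
print(store.get_command('SELECT a FROM "foo.csv" LIMIT 1'))
```

In a real service the same dispatch would sit behind the server's get handler,
with results streamed back as Arrow record batches rather than Python tuples.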

You may be interested in the FlightSQL proposal which defines at least
an interface for database systems to make themselves available over
Flight and for clients to generically query them. However that
proposal has been stalled for a while.

Best,
David

On 2021/05/15 12:15:26, Tanveer Ahmad - EWI  wrote:
> Hi all,
>
>
> For the Python Flight example
> <https://github.com/apache/arrow/tree/master/python/examples/flight>, I can
> start the server (python server.py -> Serving on grpc+tcp://localhost:5005),
> and the client can put data (python client.py put localhost:5005 mycsv.csv)
> and retrieve it with the -p (path) option (python client.py get
> localhost:5005 -p mycsv.csv).
>
>
> I am wondering how to query this data, which I have already put on the
> server with the put command, using the -c (command) option, for example:
> python client.py get localhost:5005 -c "select * from ? limit 10".
>
>
> Thanks.
>
> Regards,
> Tanveer Ahmad
>
>


Python Flight example with query command

2021-05-15 Thread Tanveer Ahmad - EWI
Hi all,


For the Python Flight example, I can start the server (python server.py ->
Serving on grpc+tcp://localhost:5005), and the client can put data (python
client.py put localhost:5005 mycsv.csv) and retrieve it with the -p (path)
option (python client.py get localhost:5005 -p mycsv.csv).


I am wondering how to query this data, which I have already put on the server
with the put command, using the -c (command) option, for example:
python client.py get localhost:5005 -c "select * from ? limit 10".


Thanks.

Regards,
Tanveer Ahmad



Arrow Flight in C

2021-04-08 Thread Tanveer Ahmad - EWI
Hi,


I need some help sending Arrow RecordBatches over Arrow Flight from inside a C
application, as no Arrow Flight interface is available in Arrow c_glib. Does
anyone have a custom C interface, or suggestions for calling the C++ Arrow
Flight functions from a C application? Thanks.



Regards,

Tanveer Ahmad



Re: Flight API for c_glib

2020-03-29 Thread Tanveer Ahmad - EWI
Thanks, Kou, for the reply.


Regards,
Tanveer Ahmad

From: Sutou Kouhei 
Sent: Thursday, March 26, 2020 9:29:28 PM
To: dev@arrow.apache.org
Subject: Re: Flight API for c_glib

Hi,

In 
  "Flight API for c_glib" on Thu, 26 Mar 2020 16:01:25 +,
  Tanveer Ahmad - EWI  wrote:

> I am wondering whether some work is being done on Flight API for c_glib?

We'll do it in a few months.


Thanks,
--
kou


Flight API for c_glib

2020-03-26 Thread Tanveer Ahmad - EWI
Hi,


I am wondering whether some work is being done on Flight API for c_glib?


Regards,
Tanveer Ahmad


Re: In-memory sorting of plasma objects

2019-08-23 Thread Tanveer Ahmad - EWI
Thank you Wes. I see.


Regards,
Tanveer Ahmad


From: Wes McKinney 
Sent: Thursday, August 22, 2019 5:12:06 PM
To: dev@arrow.apache.org
Subject: Re: In-memory sorting of plasma objects

hi Tanveer,

IIUC there is logic for moving data that's managed by Plasma servers
between nodes in the Ray project (https://github.com/ray-project/ray)
--if you need to move the bytes from one node to another you need to
use some kind of messaging / RPC tool. The Ray developers might have
some advice -- I think their implementation is specific to Ray's
internals which is why we don't have this implemented (yet) natively
in Apache Arrow

- Wes

On Thu, Aug 22, 2019 at 8:34 AM Tanveer Ahmad - EWI  wrote:
>
> Hi,
>
>
> I need some help regarding data exchange between Arrow based plasma shared 
> memory objects on cluster nodes.
>
>
> I have two Plasma shared memory objects, each containing a RecordBatch, on
> different nodes of a cluster.
>
> I want to use pandas dataframes or something like that (dask) on a single 
> node to sort them together. Is there any way to access these Plasma objects 
> on a single node and sort them in-memory?
>
> Thanks.
>
>
> Regards,
> Tanveer Ahmad


In-memory sorting of plasma objects

2019-08-22 Thread Tanveer Ahmad - EWI
Hi,


I need some help regarding data exchange between Arrow based plasma shared 
memory objects on cluster nodes.


I have two Plasma shared memory objects, each containing a RecordBatch, on
different nodes of a cluster.

I want to use pandas dataframes or something like that (dask) on a single node 
to sort them together. Is there any way to access these Plasma objects on a 
single node and sort them in-memory?

Thanks.


Regards,
Tanveer Ahmad


Re: Java OutOfMemoryException!

2019-03-24 Thread Tanveer Ahmad - EWI
Thanks Razvan.

Increasing the RootAllocator limit resolved the problem.


Regards,
Tanveer Ahmad


From: Razvan Chitu 
Sent: Sunday, March 24, 2019 2:44:57 PM
To: dev@arrow.apache.org
Cc: u...@arrow.apache.org
Subject: Re: Java OutOfMemoryException!

Hi Tanveer,

The stack trace seems to indicate that you've breached the limit of the
allocator used by the ArrowStreamReader, so that's where I'd look first.
The limit is usually set when constructing an allocator (e.g. new
RootAllocator(myLimit)) or when getting a child allocator (e.g.
rootAllocator.newChildAllocator(...)).
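
As a rough illustration of why the error can report "Current allocation: 0"
and still fail, here is a toy model in Python of an allocator with a byte
budget. It is not Arrow's Java implementation: the LimitedAllocator class is
hypothetical, and the 512 MiB and 1 GiB limits are assumptions chosen for the
example, since the limit actually in use is not shown in the trace. Only the
request size is taken from the stack trace.

```python
class LimitedAllocator:
    """Toy model of a byte-budgeted allocator (hypothetical, not Arrow's code)."""

    def __init__(self, limit):
        self.limit = limit      # maximum bytes this allocator may hand out
        self.allocated = 0      # bytes currently accounted for

    def buffer(self, size):
        """Account for `size` bytes, or fail if the budget would be exceeded."""
        if self.allocated + size > self.limit:
            raise MemoryError(
                f"Unable to allocate buffer of size {size} due to memory limit. "
                f"Current allocation: {self.allocated}"
            )
        self.allocated += size
        return size


REQUEST = 634729984  # buffer size from the stack trace, roughly 605 MiB

# Assumed limit below the request: fails even though nothing is allocated yet.
small = LimitedAllocator(limit=512 * 1024 * 1024)
try:
    small.buffer(REQUEST)
    failed = False
except MemoryError as exc:
    failed = True
    print(exc)

# Assumed larger limit: the same single request now fits.
large = LimitedAllocator(limit=1024 * 1024 * 1024)
large.buffer(REQUEST)
print(large.allocated)
```

In this model a single request larger than the budget is rejected regardless
of how much machine memory is free, which is why raising the limit passed to
the allocator, rather than changing JVM heap settings, is the first thing to
try.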

Razvan



On Sun, Mar 24, 2019 at 12:33 PM Tanveer Ahmad - EWI 
wrote:

> Hi,
>
> I am deserializing multiple Plasma objects in Java at the same time.
> Everything works fine, but when the data size increases, the following
> error occurs in some threads. Any suggestions on where I can
> increase/change the memory allocation for these processes (I have more
> memory available)? Is it JVM related or Arrow specific?
>
> Exception in thread "Thread-1"
> org.apache.arrow.memory.OutOfMemoryException: Unable to allocate buffer of
> size 634729984 due to memory limit. Current allocation: 0
> at
> org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:273)
> at
> org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:249)
> at
> org.apache.arrow.vector.ipc.message.MessageChannelReader.readMessageBody(MessageChannelReader.java:88)
> at
> org.apache.arrow.vector.ipc.message.MessageSerializer.deserializeRecordBatch(MessageSerializer.java:204)
> at
> org.apache.arrow.vector.ipc.ArrowStreamReader.loadNextBatch(ArrowStreamReader.java:116)
>
>
>
> Thanks.
>
>
> Regards,
> Tanveer Ahmad
>


Java OutOfMemoryException!

2019-03-24 Thread Tanveer Ahmad - EWI
Hi,

I am deserializing multiple Plasma objects in Java at the same time.
Everything works fine, but when the data size increases, the following error
occurs in some threads. Any suggestions on where I can increase/change the
memory allocation for these processes (I have more memory available)? Is it
JVM related or Arrow specific?

Exception in thread "Thread-1" org.apache.arrow.memory.OutOfMemoryException: Unable to allocate buffer of size 634729984 due to memory limit. Current allocation: 0
    at org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:273)
    at org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:249)
    at org.apache.arrow.vector.ipc.message.MessageChannelReader.readMessageBody(MessageChannelReader.java:88)
    at org.apache.arrow.vector.ipc.message.MessageSerializer.deserializeRecordBatch(MessageSerializer.java:204)
    at org.apache.arrow.vector.ipc.ArrowStreamReader.loadNextBatch(ArrowStreamReader.java:116)



Thanks.


Regards,
Tanveer Ahmad


Parquet format in Java

2018-10-19 Thread Tanveer Ahmad - EWI
Hi,

In Java, I'm getting a Plasma object from C++ (in Parquet format) as a byte[]
buffer. How can I convert it back to an Arrow schema/columns? Thanks.

--
Regards,
Tanveer Ahmad


RE: parquet-column_scanner-test failure

2018-10-11 Thread Tanveer Ahmad - EWI
Hi Uwe,

Here it is: https://unsee.cc/9f88adf1/

After commenting out all parquet tests, I was able to build it. 

Regards,
--
Tanveer Ahmad
PhD Student
Computer Engineering Laboratory,
Department of Quantum & Computer Engineering
EEMCS, TU Delft, The Netherlands

From: Uwe L. Korn [uw...@xhochy.com]
Sent: Thursday, October 11, 2018 2:43 PM
To: dev@arrow.apache.org
Subject: Re: parquet-column_scanner-test failure

Hello Tanveer,

your attachment did not come through as attachments are not allowed on the 
mailing list. Can you post it somewhere?

Uwe

On Thu, Oct 11, 2018, at 12:33 PM, Tanveer Ahmad - EWI wrote:
> Hi,
>
> I enabled the following flags and got the error in the attachment
> (parquet-column_scanner-test failure) when making Arrow build 11.
>
> cmake .. -DCMAKE_BUILD_TYPE=Release -DARROW_PARQUET=ON -DARROW_PLASMA=ON
> -DARROW_PLASMA_JAVA_CLIENT=ON
>
> Any help in this regard? Thanks.
>
> Regards,
> --
> Tanveer Ahmad
> PhD Student
> Computer Engineering Laboratory,
> Department of Quantum & Computer Engineering
> EEMCS, TU Delft, The Netherlands


parquet-column_scanner-test failure

2018-10-11 Thread Tanveer Ahmad - EWI
Hi,

I enabled the following flags and got the error in the attachment
(parquet-column_scanner-test failure) when making Arrow build 11.

cmake .. -DCMAKE_BUILD_TYPE=Release -DARROW_PARQUET=ON -DARROW_PLASMA=ON 
-DARROW_PLASMA_JAVA_CLIENT=ON

Any help in this regard? Thanks.

Regards,
-- 
Tanveer Ahmad
PhD Student
Computer Engineering Laboratory,
Department of Quantum & Computer Engineering
EEMCS, TU Delft, The Netherlands