date:20180509

[jira] [Created] (ARROW-2564) C++ Rowwise Tutorial is out of date

2018-05-09 Thread Kendall Willets (JIRA)

Kendall Willets created ARROW-2564:
--

 Summary: C++ Rowwise Tutorial is out of date
 Key: ARROW-2564
 URL: https://issues.apache.org/jira/browse/ARROW-2564
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.9.0
Reporter: Kendall Willets
 Attachments: sample.cpp

Copying code from the tutorial results in a compile error.  

Investigation shows that MakeTable no longer exists:

[ARROW-1341](https://issues.apache.org/jira/browse/ARROW-1341) - [C++] 
Deprecate arrow::MakeTable

Attached is a working version which may be either copied into the tutorial or 
added as a sample.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

RE: Question about streaming to memorymapped files

2018-05-09 Thread Ambalu, Robert

I don’t use the output stream objects directly though, right? Just to take a 
step back a bit, what I'm trying to do is generate streaming rows to a table 
in real time (with the ability to control how many rows to batch up before 
writing out a record batch).

My understanding is that to properly stream table data I need to:
a) create an output stream instance
b) create a RecordBatchStreamWriter, binding my stream object to it
c) create a RecordBatchBuilder.  As rows are added, add them to the record batch 
builder.  When we're ready to flush, call Flush on the batch builder to create a 
record batch and pass the batch to the RecordBatchStreamWriter.

I was hoping to use MemoryMappedFile for (a), but since it doesn’t support 
dynamically growing the mmap file I'll have to write my own implementation.
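
The batching part of the steps above can be sketched without any Arrow types. This is only an illustration under stated assumptions: Row, Batch and RowBatcher are hypothetical stand-ins (not Arrow classes), and the sink callback is where a real program would hand the finished batch to its stream writer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT Arrow classes.
struct Row {
  int id;
  std::string name;
};

using Batch = std::vector<Row>;

// Collects rows and hands a batch to `sink` once `batch_size` rows are pending.
class RowBatcher {
 public:
  RowBatcher(std::size_t batch_size, std::function<void(const Batch&)> sink)
      : batch_size_(batch_size), sink_(std::move(sink)) {
    assert(batch_size_ > 0);
  }

  // Buffers one row; flushes automatically when the batch is full.
  void Append(Row row) {
    pending_.push_back(std::move(row));
    if (pending_.size() >= batch_size_) Flush();
  }

  // Emits whatever is pending; does nothing when no rows are buffered.
  void Flush() {
    if (pending_.empty()) return;
    sink_(pending_);
    pending_.clear();
  }

 private:
  std::size_t batch_size_;
  std::function<void(const Batch&)> sink_;
  Batch pending_;
};
```

With a batch size of 3, appending 7 rows and then calling Flush() once more would deliver batches of 3, 3 and 1 rows to the sink.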

-Original Message-
From: Antoine Pitrou [mailto:anto...@python.org] 
Sent: Wednesday, May 09, 2018 11:42 AM
To: dev@arrow.apache.org
Subject: Re: Question about streaming to memorymapped files


As for buffering data before making a call to write(): in Arrow 0.10.0
you'll be able to use BufferedOutputStream for this:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/buffered.h

Regards

Antoine.


Le 09/05/2018 à 17:39, Ambalu, Robert a écrit :
> I don’t have any offhand, no, but I would imagine that direct file writes 
> will at some point need to make a system call, which is expensive ( fwrite 
> might buffer before eventually making the sys call, looks like 
> FileOutputStream uses the raw system write for every write call).
> The current MMap io interface isn’t usable as a streaming output 
> unfortunately, though I suppose I could just implement my own
> 
> -Original Message-
> From: Antoine Pitrou [mailto:solip...@pitrou.net] 
> Sent: Wednesday, May 09, 2018 11:11 AM
> To: dev@arrow.apache.org
> Subject: Re: Question about streaming to memorymapped files
> 
> 
> Do you know of any benchmark numbers / performance studies about this?
> While it's true that a memory-mapped file avoids explicit system calls,
> I've heard file I/O is quite well optimized, at least on Linux,
> nowadays.
> 
> Regards
> 
> Antoine.
> 
> 
> On Wed, 9 May 2018 14:47:53 +
> "Ambalu, Robert"  wrote:
>> Antoine, thanks for the quick reply.
>> You can actually grow memorymapped files with a mremap call ( and I think a 
>> seek/write on the file ), I do this in my applications and it works fine.
>> I want the efficiency of writing via memory maps, so would prefer to avoid 
>> FileOutputStream
>>
>> -Original Message-
>> From: Antoine Pitrou [mailto:anto...@python.org] 
>> Sent: Wednesday, May 09, 2018 10:37 AM
>> To: dev@arrow.apache.org
>> Subject: Re: Question about streaming to memorymapped files
>>
>>
>> Hi,
>>
>> If you don't know the output size upfront then you should probably use a
>> FileOutputStream instead.  By definition, memory mapped files must have
>> a fixed size (since they are mapped to a fixed area in virtual memory).
>>
>> Regards
>>
>> Antoine.
>>
>>
>> Le 09/05/2018 à 16:31, Ambalu, Robert a écrit :
>>> Hey, I'm looking into streaming table updates into a memory mapped file ( 
>>> C++ )
>>> I think I have everything I need ( MemoryMappedFile output streamer, 
>>> RecordBatchStreamWriter ) but I don't understand how to properly create the 
>>> memmap file.  It looks like it requires you to preset a size to the file 
>>> when you create it, but since I'll be streaming I don't actually know how 
>>> big a file I'm going to need...
>>> Am I missing some other API point here?  Any reason why size is required up 
>>> front and the memmap doesn't auto-grow as needed?
>>>
>>> Thanks in advance
>>> - Rob
>>>
>>>
>>>
>>>
>>>
>>> DISCLAIMER: This e-mail message and any attachments are intended solely for 
>>> the use of the individual or entity to which it is addressed and may 
>>> contain information that is confidential or legally privileged. If you are 
>>> not the intended recipient, you are hereby notified that any dissemination, 
>>> distribution, copying or other use of this message or its attachments is 
>>> strictly prohibited. If you have received this message in error, please 
>>> notify the sender immediately and permanently delete this message and any 
>>> attachments.
>>>
>>>
>>>
>>>   
>

Re: Question about streaming to memorymapped files

2018-05-09 Thread Antoine Pitrou


As for buffering data before making a call to write(): in Arrow 0.10.0
you'll be able to use BufferedOutputStream for this:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/buffered.h

Regards

Antoine.


Le 09/05/2018 à 17:39, Ambalu, Robert a écrit :
> I don’t have any offhand, no, but I would imagine that direct file writes 
> will at some point need to make a system call, which is expensive ( fwrite 
> might buffer before eventually making the sys call, looks like 
> FileOutputStream uses the raw system write for every write call).
> The current MMap io interface isn’t usable as a streaming output 
> unfortunately, though I suppose I could just implement my own
> 
> -Original Message-
> From: Antoine Pitrou [mailto:solip...@pitrou.net] 
> Sent: Wednesday, May 09, 2018 11:11 AM
> To: dev@arrow.apache.org
> Subject: Re: Question about streaming to memorymapped files
> 
> 
> Do you know of any benchmark numbers / performance studies about this?
> While it's true that a memory-mapped file avoids explicit system calls,
> I've heard file I/O is quite well optimized, at least on Linux,
> nowadays.
> 
> Regards
> 
> Antoine.
> 
> 
> On Wed, 9 May 2018 14:47:53 +
> "Ambalu, Robert"  wrote:
>> Antoine, thanks for the quick reply.
>> You can actually grow memorymapped files with a mremap call ( and I think a 
>> seek/write on the file ), I do this in my applications and it works fine.
>> I want the efficiency of writing via memory maps, so would prefer to avoid 
>> FileOutputStream
>>
>> -Original Message-
>> From: Antoine Pitrou [mailto:anto...@python.org] 
>> Sent: Wednesday, May 09, 2018 10:37 AM
>> To: dev@arrow.apache.org
>> Subject: Re: Question about streaming to memorymapped files
>>
>>
>> Hi,
>>
>> If you don't know the output size upfront then you should probably use a
>> FileOutputStream instead.  By definition, memory mapped files must have
>> a fixed size (since they are mapped to a fixed area in virtual memory).
>>
>> Regards
>>
>> Antoine.
>>
>>
>> Le 09/05/2018 à 16:31, Ambalu, Robert a écrit :
>>> Hey, I'm looking into streaming table updates into a memory mapped file ( 
>>> C++ )
>>> I think I have everything I need ( MemoryMappedFile output streamer, 
>>> RecordBatchStreamWriter ) but I don't understand how to properly create the 
>>> memmap file.  It looks like it requires you to preset a size to the file 
>>> when you create it, but since I'll be streaming I don't actually know how 
>>> big a file I'm going to need...
>>> Am I missing some other API point here?  Any reason why size is required up 
>>> front and the memmap doesn't auto-grow as needed?
>>>
>>> Thanks in advance
>>> - Rob
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>   
>

RE: Question about streaming to memorymapped files

2018-05-09 Thread Ambalu, Robert

I don’t have any offhand, no, but I would imagine that direct file writes will 
at some point need to make a system call, which is expensive ( fwrite might 
buffer before eventually making the sys call, looks like FileOutputStream uses 
the raw system write for every write call).
The current MMap io interface isn’t usable as a streaming output unfortunately, 
though I suppose I could just implement my own
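
To make the point about per-write system calls concrete, here is a minimal sketch of write buffering. It is not Arrow's BufferedOutputStream: BufferedSink and its RawWrite callback are hypothetical names, and the callback stands in for whatever expensive raw write (for example a write() system call) sits underneath.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical illustration: coalesces many small writes into fewer raw writes.
class BufferedSink {
 public:
  using RawWrite = std::function<void(const char*, std::size_t)>;

  BufferedSink(std::size_t capacity, RawWrite raw)
      : capacity_(capacity), raw_(std::move(raw)) {
    assert(capacity_ > 0);
    buffer_.reserve(capacity_);
  }

  void Write(const char* data, std::size_t len) {
    // Flush first if the new data would overflow the buffer (keeps ordering).
    if (buffer_.size() + len > capacity_) Flush();
    if (len >= capacity_) {
      raw_(data, len);  // too large to be worth buffering: pass straight through
      return;
    }
    buffer_.append(data, len);
  }

  // Pushes any buffered bytes to the raw writer; no-op when empty.
  void Flush() {
    if (buffer_.empty()) return;
    raw_(buffer_.data(), buffer_.size());
    buffer_.clear();
  }

 private:
  std::size_t capacity_;
  RawWrite raw_;
  std::string buffer_;
};
```

With a 64-byte buffer, twenty 10-byte writes followed by a final Flush() reach the raw writer as 4 calls rather than 20.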

-Original Message-
From: Antoine Pitrou [mailto:solip...@pitrou.net] 
Sent: Wednesday, May 09, 2018 11:11 AM
To: dev@arrow.apache.org
Subject: Re: Question about streaming to memorymapped files


Do you know of any benchmark numbers / performance studies about this?
While it's true that a memory-mapped file avoids explicit system calls,
I've heard file I/O is quite well optimized, at least on Linux,
nowadays.

Regards

Antoine.


On Wed, 9 May 2018 14:47:53 +
"Ambalu, Robert"  wrote:
> Antoine, thanks for the quick reply.
> You can actually grow memorymapped files with a mremap call ( and I think a 
> seek/write on the file ), I do this in my applications and it works fine.
> I want the efficiency of writing via memory maps, so would prefer to avoid 
> FileOutputStream
> 
> -Original Message-
> From: Antoine Pitrou [mailto:anto...@python.org] 
> Sent: Wednesday, May 09, 2018 10:37 AM
> To: dev@arrow.apache.org
> Subject: Re: Question about streaming to memorymapped files
> 
> 
> Hi,
> 
> If you don't know the output size upfront then you should probably use a
> FileOutputStream instead.  By definition, memory mapped files must have
> a fixed size (since they are mapped to a fixed area in virtual memory).
> 
> Regards
> 
> Antoine.
> 
> 
> Le 09/05/2018 à 16:31, Ambalu, Robert a écrit :
> > Hey, I'm looking into streaming table updates into a memory mapped file ( 
> > C++ )
> > I think I have everything I need ( MemoryMappedFile output streamer, 
> > RecordBatchStreamWriter ) but I don't understand how to properly create the 
> > memmap file.  It looks like it requires you to preset a size to the file 
> > when you create it, but since I'll be streaming I don't actually know how 
> > big a file I'm going to need...
> > Am I missing some other API point here?  Any reason why size is required up 
> > front and the memmap doesn't auto-grow as needed?
> > 
> > Thanks in advance
> > - Rob
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> >

[jira] [Created] (ARROW-2563) [Rust] Poor caching in Travis-CI

2018-05-09 Thread Antoine Pitrou (JIRA)

Antoine Pitrou created ARROW-2563:
-

 Summary: [Rust] Poor caching in Travis-CI
 Key: ARROW-2563
 URL: https://issues.apache.org/jira/browse/ARROW-2563
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Continuous Integration, Rust
Reporter: Antoine Pitrou


Since the Rust project isn't at the repo root, Travis-CI won't cache compiled 
artifacts by default. This leads to long CI times as all packages get 
recompiled (see https://docs.travis-ci.com/user/caching/#Rust-Cargo-cache for 
what gets cached).

In https://travis-ci.org/pitrou/arrow/jobs/376859806 I tried the following:
{code}
export CARGO_TARGET_DIR=$TRAVIS_BUILD_DIR/target
{code}

and after a first run, the build time went down to 2 minutes (from 15-18 
minutes).




file-system specification

2018-05-09 Thread Martin Durant

I have sketched out a possible start of a python-wide file-system specification
https://github.com/martindurant/filesystem_spec

This came about from my work in some other (remote) file-systems 
implementations for python, particularly in the context of Dask. Since arrow 
also cares about both local files and, for example, hdfs, I thought that people 
on this list may have comments and opinions about a possible standard that we 
ought to converge on. I do not think that my suggestions so far are necessarily 
right or even good in many cases, but I want to get the conversation going. You 
will see that there are already some PRs and issues on the repo, please use 
liberally!
Finally, if this project gathers steam, then the spec should be moved to a more 
prominent and standard location.

Thanks,
MD

—
Martin Durant
martin.dur...@utoronto.ca

Re: Question about streaming to memorymapped files

2018-05-09 Thread Antoine Pitrou


Do you know of any benchmark numbers / performance studies about this?
While it's true that a memory-mapped file avoids explicit system calls,
I've heard file I/O is quite well optimized, at least on Linux,
nowadays.

Regards

Antoine.


On Wed, 9 May 2018 14:47:53 +
"Ambalu, Robert"  wrote:
> Antoine, thanks for the quick reply.
> You can actually grow memorymapped files with a mremap call ( and I think a 
> seek/write on the file ), I do this in my applications and it works fine.
> I want the efficiency of writing via memory maps, so would prefer to avoid 
> FileOutputStream
> 
> -Original Message-
> From: Antoine Pitrou [mailto:anto...@python.org] 
> Sent: Wednesday, May 09, 2018 10:37 AM
> To: dev@arrow.apache.org
> Subject: Re: Question about streaming to memorymapped files
> 
> 
> Hi,
> 
> If you don't know the output size upfront then you should probably use a
> FileOutputStream instead.  By definition, memory mapped files must have
> a fixed size (since they are mapped to a fixed area in virtual memory).
> 
> Regards
> 
> Antoine.
> 
> 
> Le 09/05/2018 à 16:31, Ambalu, Robert a écrit :
> > Hey, I'm looking into streaming table updates into a memory mapped file ( 
> > C++ )
> > I think I have everything I need ( MemoryMappedFile output streamer, 
> > RecordBatchStreamWriter ) but I don't understand how to properly create the 
> > memmap file.  It looks like it requires you to preset a size to the file 
> > when you create it, but since I'll be streaming I don't actually know how 
> > big a file I'm going to need...
> > Am I missing some other API point here?  Any reason why size is required up 
> > front and the memmap doesn't auto-grow as needed?
> > 
> > Thanks in advance
> > - Rob
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> >

RE: Question about streaming to memorymapped files

2018-05-09 Thread Ambalu, Robert

Antoine, thanks for the quick reply.
You can actually grow memorymapped files with a mremap call ( and I think a 
seek/write on the file ), I do this in my applications and it works fine.
I want the efficiency of writing via memory maps, so would prefer to avoid 
FileOutputStream
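
The grow-by-remap approach mentioned above can be sketched roughly as follows. This is a minimal, Linux-only illustration and not part of Arrow: GrowableMmap is a hypothetical helper, it uses ftruncate() to extend the file (instead of the seek/write mentioned above), and it doubles the capacity when it runs out of room.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // mremap() is a Linux extension
#endif
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper: an append-only, file-backed mapping that grows on demand.
class GrowableMmap {
 public:
  GrowableMmap(const std::string& path, std::size_t initial_capacity)
      : capacity_(initial_capacity) {
    if (capacity_ == 0) throw std::invalid_argument("capacity must be > 0");
    fd_ = ::open(path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd_ < 0) throw std::runtime_error("open failed");
    // Extend the file first: touching mapped pages past EOF raises SIGBUS.
    if (::ftruncate(fd_, static_cast<off_t>(capacity_)) != 0) {
      ::close(fd_);
      throw std::runtime_error("ftruncate failed");
    }
    void* p = ::mmap(nullptr, capacity_, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0);
    if (p == MAP_FAILED) {
      ::close(fd_);
      throw std::runtime_error("mmap failed");
    }
    data_ = static_cast<char*>(p);
  }
  GrowableMmap(const GrowableMmap&) = delete;
  GrowableMmap& operator=(const GrowableMmap&) = delete;

  ~GrowableMmap() {
    ::munmap(data_, capacity_);
    // Best effort: trim the unused tail so the file holds only written bytes.
    if (::ftruncate(fd_, static_cast<off_t>(used_)) != 0) { /* ignored */ }
    ::close(fd_);
  }

  void Append(const char* src, std::size_t len) {
    assert(src != nullptr || len == 0);
    if (len == 0) return;
    if (used_ + len > capacity_) Grow(std::max(capacity_ * 2, used_ + len));
    std::memcpy(data_ + used_, src, len);
    used_ += len;
  }

  const char* data() const { return data_; }
  std::size_t used() const { return used_; }
  std::size_t capacity() const { return capacity_; }

 private:
  void Grow(std::size_t new_capacity) {
    if (::ftruncate(fd_, static_cast<off_t>(new_capacity)) != 0)
      throw std::runtime_error("ftruncate failed");
    // MREMAP_MAYMOVE lets the kernel relocate the mapping, so pointers
    // previously obtained from data() are invalid after this call.
    void* p = ::mremap(data_, capacity_, new_capacity, MREMAP_MAYMOVE);
    if (p == MAP_FAILED) throw std::runtime_error("mremap failed");
    data_ = static_cast<char*>(p);
    capacity_ = new_capacity;
  }

  int fd_ = -1;
  char* data_ = nullptr;
  std::size_t capacity_;
  std::size_t used_ = 0;
};
```

Because the mapping may move when it grows, callers of such a helper should not hold on to raw pointers across Append() calls.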

-Original Message-
From: Antoine Pitrou [mailto:anto...@python.org] 
Sent: Wednesday, May 09, 2018 10:37 AM
To: dev@arrow.apache.org
Subject: Re: Question about streaming to memorymapped files


Hi,

If you don't know the output size upfront then you should probably use a
FileOutputStream instead.  By definition, memory mapped files must have
a fixed size (since they are mapped to a fixed area in virtual memory).

Regards

Antoine.


Le 09/05/2018 à 16:31, Ambalu, Robert a écrit :
> Hey, I'm looking into streaming table updates into a memory mapped file ( C++ 
> )
> I think I have everything I need ( MemoryMappedFile output streamer, 
> RecordBatchStreamWriter ) but I don't understand how to properly create the 
> memmap file.  It looks like it requires you to preset a size to the file when 
> you create it, but since I'll be streaming I don't actually know how big a 
> file I'm going to need...
> Am I missing some other API point here?  Any reason why size is required up 
> front and the memmap doesn't auto-grow as needed?
> 
> Thanks in advance
> - Rob
> 
> 
> 
> 
> 
> 
> 
> 
>

Re: Question about streaming to memorymapped files

2018-05-09 Thread Antoine Pitrou


Hi,

If you don't know the output size upfront then you should probably use a
FileOutputStream instead.  By definition, memory mapped files must have
a fixed size (since they are mapped to a fixed area in virtual memory).

Regards

Antoine.


Le 09/05/2018 à 16:31, Ambalu, Robert a écrit :
> Hey, I'm looking into streaming table updates into a memory mapped file ( C++ 
> )
> I think I have everything I need ( MemoryMappedFile output streamer, 
> RecordBatchStreamWriter ) but I don't understand how to properly create the 
> memmap file.  It looks like it requires you to preset a size to the file when 
> you create it, but since I'll be streaming I don't actually know how big a 
> file I'm going to need...
> Am I missing some other API point here?  Any reason why size is required up 
> front and the memmap doesn't auto-grow as needed?
> 
> Thanks in advance
> - Rob
> 
> 
> 
> 
> 
> 
> 
> 
>

Question about streaming to memorymapped files

2018-05-09 Thread Ambalu, Robert

Hey, I'm looking into streaming table updates into a memory mapped file ( C++ )
I think I have everything I need ( MemoryMappedFile output streamer, 
RecordBatchStreamWriter ) but I don't understand how to properly create the 
memmap file.  It looks like it requires you to preset a size to the file when 
you create it, but since I'll be streaming I don't actually know how big a file 
I'm going to need...
Am I missing some other API point here?  Any reason why size is required up 
front and the memmap doesn't auto-grow as needed?

Thanks in advance
- Rob






[jira] [Created] (ARROW-2562) [C++] Upload coverage data to codecov.io

2018-05-09 Thread Antoine Pitrou (JIRA)

Antoine Pitrou created ARROW-2562:
-

 Summary: [C++] Upload coverage data to codecov.io
 Key: ARROW-2562
 URL: https://issues.apache.org/jira/browse/ARROW-2562
 Project: Apache Arrow
  Issue Type: Task
  Components: C++
Reporter: Antoine Pitrou
Assignee: Antoine Pitrou


ARROW-27 (upload coverage data to coveralls.io) has failed to move forward. We 
can try codecov.io instead, another free code coverage hosting service.




[jira] [Created] (ARROW-2561) [C++] Crash in cuda-test shutdown with coverage enabled

2018-05-09 Thread Antoine Pitrou (JIRA)

Antoine Pitrou created ARROW-2561:
-

 Summary: [C++] Crash in cuda-test shutdown with coverage enabled
 Key: ARROW-2561
 URL: https://issues.apache.org/jira/browse/ARROW-2561
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++, GPU
Affects Versions: 0.9.0
Reporter: Antoine Pitrou


If I enable both CUDA and code coverage (using 
{{-DARROW_GENERATE_COVERAGE=on}}), {{cuda-test}} sometimes crashes at shutdown 
with the following message:

{code}
*** Error in `./build-test/debug/cuda-test': corrupted size vs. prev_size: 
0x01612bb0 ***
=== Backtrace: =
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7fc3d61e47e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x7e9dc)[0x7fc3d61eb9dc]
/lib/x86_64-linux-gnu/libc.so.6(+0x81cde)[0x7fc3d61eecde]
/lib/x86_64-linux-gnu/libc.so.6(__libc_malloc+0x54)[0x7fc3d61f1184]
/home/antoine/arrow/cpp/build-test/debug/libarrow.so.10(+0x9350f3)[0x7fc3d5a510f3]
/lib/x86_64-linux-gnu/libc.so.6(__cxa_finalize+0x9a)[0x7fc3d61a736a]
/home/antoine/arrow/cpp/build-test/debug/libarrow.so.10(+0x3415e3)[0x7fc3d545d5e3]
{code}

(the CUDA tests themselves pass)




[jira] [Created] (ARROW-2560) [Rust] The Rust README should include Rust-specific information on contributing

2018-05-09 Thread Andy Grove (JIRA)

Andy Grove created ARROW-2560:
-

 Summary: [Rust] The Rust README should include Rust-specific 
information on contributing
 Key: ARROW-2560
 URL: https://issues.apache.org/jira/browse/ARROW-2560
 Project: Apache Arrow
  Issue Type: Task
Reporter: Andy Grove
 Fix For: 0.10.0


Every new contributor has their first build fail because they didn't know to 
use cargo fmt.

We should explain this in the Rust README along with any other pertinent 
information specific to Rust contributions.




Re: plasma tutorial which version?

2018-05-09 Thread Antoine Pitrou


Hi Kendall,

Le 09/05/2018 à 03:02, Kendall Willets a écrit :
> This reminds me -- I had similar trouble with the sample code on the C++
> tutorial (
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html),
> which appears to be out of date.  Should I submit a PR for the doc, or open
> a jira, etc.?  I also have a buildable sample.

You should open a JIRA and submit a PR, yes.

Regards

Antoine.


> 
> On Wed, May 2, 2018 at 5:08 AM, Uwe L. Korn  wrote:
> 
>> Hello Ovidiu,
>>
>> we actually would like to keep the tutorial in line with the API we have
>> in master. Can you post the errors you're getting?
>>
>> This would help us in understanding what is outdated in the steps.
>>
>> Uwe
>>
>> On Wed, May 2, 2018, at 11:53 AM, Ovidiu-Cristian MARCU wrote:
>>> Hi,
>>>
>>> Trying to follow the steps in
>>> https://arrow.apache.org/docs/cpp/md_tutorials_plasma.html
>>>  with
>> master
>>> branch does not work.
>>> Could you please provide the arrow version that is compatible with this
>>> tutorial?
>>>
>>> Thanks,
>>> Ovidiu
>>
>

[jira] [Created] (ARROW-2559) [Plasma] delete object notification queue for a client when it disconnects with plasma

2018-05-09 Thread Zhijun Fu (JIRA)

Zhijun Fu created ARROW-2559:


 Summary: [Plasma] delete object notification queue for a client 
when it disconnects with plasma
 Key: ARROW-2559
 URL: https://issues.apache.org/jira/browse/ARROW-2559
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Zhijun Fu







[jira] [Created] (ARROW-2558) [Plasma] avoid walk through all the objects when a client disconnects

2018-05-09 Thread Zhijun Fu (JIRA)

Zhijun Fu created ARROW-2558:


 Summary: [Plasma] avoid walk through all the objects when a client 
disconnects
 Key: ARROW-2558
 URL: https://issues.apache.org/jira/browse/ARROW-2558
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Plasma (C++)
Reporter: Zhijun Fu


Currently plasma stores a list of clients in ObjectTableEntry, which is used to 
track which clients are using a given object. This serves two purposes:
- Determining whether an object is in use.
- Determining whether the client trying to abort an object is the one who created it.

A problem with the list-of-clients approach is that when a client disconnects, we 
need to walk through all the objects and remove the client pointer from the 
list for each object.

Instead, we could add a reference count in ObjectTableEntry, and store a 
list of object IDs in the client structure. This could achieve both goals that 
the original approach is targeting, while when a client disconnects, it just 
walks through its object IDs and decrements the reference count of each 
ObjectTableEntry; there's no need to walk through all objects.
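
A minimal sketch of the proposed bookkeeping, under stated assumptions: the types below are simplified, hypothetical stand-ins (Plasma's real ObjectTableEntry and client structures differ), and ObjectID is modelled as a plain string.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical simplified types for illustration; not Plasma's actual structures.
using ObjectID = std::string;

struct ObjectTableEntry {
  int ref_count = 0;  // number of clients currently using the object
};

struct Client {
  std::unordered_set<ObjectID> object_ids;  // objects this client is using
};

class ObjectTable {
 public:
  // Records that `client` uses `id`; counts each (client, object) pair once.
  void AddReference(Client& client, const ObjectID& id) {
    if (client.object_ids.insert(id).second) ++entries_[id].ref_count;
  }

  // Touches only the objects this client holds, not the whole table.
  void Disconnect(Client& client) {
    for (const ObjectID& id : client.object_ids) {
      auto it = entries_.find(id);
      assert(it != entries_.end() && it->second.ref_count > 0);
      --it->second.ref_count;
    }
    client.object_ids.clear();
  }

  bool InUse(const ObjectID& id) const {
    auto it = entries_.find(id);
    return it != entries_.end() && it->second.ref_count > 0;
  }

  // Membership test that an abort check could build on.
  static bool Holds(const Client& client, const ObjectID& id) {
    return client.object_ids.count(id) > 0;
  }

 private:
  std::unordered_map<ObjectID, ObjectTableEntry> entries_;
};
```

The cost of a disconnect then scales with the number of objects that client holds rather than with the total number of objects in the store.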




[jira] [Created] (ARROW-2557) [Rust] Add badge for code coverage in README

2018-05-09 Thread Chao Sun (JIRA)

Chao Sun created ARROW-2557:
---

 Summary: [Rust] Add badge for code coverage in README
 Key: ARROW-2557
 URL: https://issues.apache.org/jira/browse/ARROW-2557
 Project: Apache Arrow
  Issue Type: Test
  Components: Rust
Reporter: Chao Sun
Assignee: Chao Sun


Following up on ARROW-2477, it may be good to add a badge in the README to report 
the current code coverage.



