[ 
https://issues.apache.org/jira/browse/SINGA-130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231526#comment-15231526
 ] 

ASF subversion and git services commented on SINGA-130:
-------------------------------------------------------

Commit a0bdd0b85ddba7d670ab04c5de04a29c8366e868 in incubator-singa's branch 
refs/heads/master from [~ug93tad]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-singa.git;h=a0bdd0b ]

SINGA-130 Data prefetching layer

Extended StoreInputLayer to support data prefetching. The layer maintains a
buffer for (key, value) pairs read from the storage layer. In Setup(), it
launches a new thread that reads data into the buffer. The ComputeFeature()
method waits for that thread to finish (join) before parsing the buffered
data into the data_ and aux_ fields. Finally, it launches another thread to
refill the buffer.
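
The join-then-relaunch pattern described in this commit message can be
sketched as below. This is only an illustration, not the actual SINGA code:
FakeStore, PrefetchingInput and LaunchPrefetch are hypothetical names, the
store is an in-memory stand-in for the storage layer, and "parsing" is
reduced to copying the value of each record.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

using Record = std::pair<std::string, std::string>;  // (key, value)

// Hypothetical stand-in for the storage layer: returns records in order
// and wraps around at the end.
class FakeStore {
 public:
  explicit FakeStore(std::vector<Record> records)
      : records_(std::move(records)) {}

  Record Read() {
    Record rec = records_[pos_];
    pos_ = (pos_ + 1) % records_.size();
    return rec;
  }

 private:
  std::vector<Record> records_;
  std::size_t pos_ = 0;
};

// Sketch of an input layer that prefetches one batch on a background thread.
class PrefetchingInput {
 public:
  PrefetchingInput(FakeStore store, std::size_t batchsize)
      : store_(std::move(store)), batchsize_(batchsize) {}

  ~PrefetchingInput() {
    if (worker_.joinable()) worker_.join();  // never destroy a running thread
  }

  // Start the first background read before any ComputeFeature() call.
  void Setup() { LaunchPrefetch(); }

  // Wait for the pending read, parse the buffer, then start the next read.
  void ComputeFeature() {
    assert(worker_.joinable() && "Setup() must be called first");
    worker_.join();  // after join, only this thread touches buffer_
    data_.clear();
    for (const Record& kv : buffer_) data_.push_back(kv.second);
    LaunchPrefetch();
  }

  const std::vector<std::string>& data() const { return data_; }

 private:
  // Fill buffer_ with batchsize_ records on a new thread.
  void LaunchPrefetch() {
    worker_ = std::thread([this] {
      buffer_.clear();
      for (std::size_t i = 0; i < batchsize_; ++i) {
        buffer_.push_back(store_.Read());
      }
    });
  }

  FakeStore store_;
  std::size_t batchsize_;
  std::vector<Record> buffer_;     // extra batchsize * recordsize bytes
  std::vector<std::string> data_;  // parsed batch handed to later layers
  std::thread worker_;
};
```

No mutex is needed in this sketch because the buffer is only ever accessed
by one thread at a time: the worker while it runs, and the caller after
join() returns. Calling Setup() once and then ComputeFeature() repeatedly
yields consecutive batches of batchsize records.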

In terms of memory consumption, prefetching uses an extra
(batchsize*recordsize) bytes for the buffer. However, we observe no visible
runtime improvement, as I/O time is very small (on the order of milliseconds
without prefetching, and tens of microseconds with prefetching) compared to
CPU time.


> Implement a layer subclass for data prefetching
> -----------------------------------------------
>
>                 Key: SINGA-130
>                 URL: https://issues.apache.org/jira/browse/SINGA-130
>             Project: Singa
>          Issue Type: New Feature
>            Reporter: wangwei
>            Assignee: Anh Dinh
>              Labels: data, multi-threading, prefetch
>
> Data prefetching is important for training with GPU, because the IO would 
> become the bottleneck when the computation is very fast.
> One idea is to create a general prefetch layer which embeds the application 
> specific data loading layers. 
> {code}
> PrefetchLayer::ComputeFeature() {
>   wait until the prefetch thread finishes.
>   swap the prefetch_data_ and data_ blobs.
>   if (first time)
>      load data into data_ blobs
>   spawn a new thread to call functions from data loading layers for loading 
> data into prefetch_data_.
> }
> {code}
>  
> If the prefetch layer has multiple loading layers and is connected to 
> multiple destination layers, then different destination layers may want data 
> loaded by different loading layers. This case should be handled properly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
