date:20071130

[jira] Commented: (HADOOP-2249) [hbase] Add means of getting the timestamps for all cell versions: e.g. long [] getVersions(row, column)

2007-11-30 Thread Bryan Duxbury (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547389
 ] 

Bryan Duxbury commented on HADOOP-2249:
---

I was hoping for a method that would give you all of the timestamps for a row, 
not per row and column. You'd want to able to do this so that you could know 
what different versions of a particular entity you have in HBase. As such, 
basically all I'm looking for is:

long[] getRowTimestamps(Text row)

> [hbase] Add means of getting the timestamps for all cell versions: e.g. long 
> [] getVersions(row, column)
> 
>
> Key: HADOOP-2249
> URL: https://issues.apache.org/jira/browse/HADOOP-2249
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: stack
>Priority: Minor
> Fix For: 0.16.0
>
>
> Should be means of asking hbase for list of all the timestamps associated 
> with a particular cell.  The brute force way would be adding a getVersions 
> method but perhaps we can come up w/ something more elegant than this?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-2314) TestBlockReplacement occasionally get into an infinite loop

2007-11-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HADOOP-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547383
 ] 

Hadoop QA commented on HADOOP-2314:
---

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12370690/block.patch
against trunk revision r600019.

@author +1.  The patch does not contain any @author tags.

javadoc +1.  The javadoc tool did not generate any warning messages.

javac +1.  The applied patch does not generate any new compiler warnings.

findbugs +1.  The patch does not introduce any new Findbugs warnings.

core tests +1.  The patch passed core unit tests.

contrib tests +1.  The patch passed contrib unit tests.

Test results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1226/testReport/
Findbugs warnings: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1226/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1226/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1226/console

This message is automatically generated.

> TestBlockReplacement occasionally get into an infinite loop
> ---
>
> Key: HADOOP-2314
> URL: https://issues.apache.org/jira/browse/HADOOP-2314
> Project: Hadoop
>  Issue Type: Bug
>  Components: dfs
>Affects Versions: 0.15.1
>Reporter: Hairong Kuang
>Assignee: Hairong Kuang
> Fix For: 0.16.0
>
> Attachments: block.patch
>
>
> It turns out that in the case that tests an invalid deletion hint, either the 
> newNode or source may be choosen to be deleted as an exessive replica since 
> both of the nodes are on the same rack. The test assumes that only newNode 
> will be deleted and wait for its deletion. This causes an infinite loop when 
> source is chosen to be deleted.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-2326) Add a Random backoff for the initial block report sent to the Name node

2007-11-30 Thread dhruba borthakur (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-2326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HADOOP-2326:
-

Attachment: (was: asyncRPC-4.patch)

> Add a Random backoff for the initial block report sent to the Name node
> ---
>
> Key: HADOOP-2326
> URL: https://issues.apache.org/jira/browse/HADOOP-2326
> Project: Hadoop
>  Issue Type: Improvement
>  Components: dfs
>Reporter: Sanjay Radia
>Assignee: Sanjay Radia
> Fix For: 0.16.0
>
>
> Startup time can be improved if the initial block reports are spread randomly 
> over small period of time.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-2326) Add a Random backoff for the initial block report sent to the Name node

2007-11-30 Thread dhruba borthakur (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-2326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HADOOP-2326:
-

Comment: was deleted

> Add a Random backoff for the initial block report sent to the Name node
> ---
>
> Key: HADOOP-2326
> URL: https://issues.apache.org/jira/browse/HADOOP-2326
> Project: Hadoop
>  Issue Type: Improvement
>  Components: dfs
>Reporter: Sanjay Radia
>Assignee: Sanjay Radia
> Fix For: 0.16.0
>
>
> Startup time can be improved if the initial block reports are spread randomly 
> over small period of time.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-1841) IPC server should write repsonses asynchronously

2007-11-30 Thread dhruba borthakur (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HADOOP-1841:
-

Attachment: asyncRPC-4.patch

This patch is in response to Doug's comments that we need to show improvement 
in performance in the face of slow clients.

This patch has a unit test that creates a server with one handler thread. One 
thread makes an RPC and stops processing the response. Another thread then 
issues another RPC and it completes successfully even though the first thread 
has not yet consumed the RPC response. This test passes successfully with this 
patch whereas it fails with trunk.

Please let me know is it addresses your concerns. If so, then the only 
remaining thing to make this patch committable is to demonstrate that it does 
not degrade performance for sort runs.

> IPC server should write repsonses asynchronously
> 
>
> Key: HADOOP-1841
> URL: https://issues.apache.org/jira/browse/HADOOP-1841
> Project: Hadoop
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Doug Cutting
>Assignee: dhruba borthakur
> Attachments: asyncRPC-2.patch, asyncRPC-4.patch, asyncRPC.patch, 
> asyncRPC.patch
>
>
> Hadoop's IPC Server currently writes responses from request handler threads 
> using blocking writes.  Performance and scalability might be improved if 
> responses were written asynchronously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HADOOP-2326) Add a Random backoff for the initial block report sent to the Name node

2007-11-30 Thread dhruba borthakur (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-2326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HADOOP-2326:
-

Attachment: asyncRPC-4.patch

This patch is in response to Doug's comments that we need to show improvement 
in performance in the face of slow clients.

This patch has a unit test that creates a server with one handler thread. One 
thread makes an RPC and stops processing the response. Another thread then 
issues another RPC and it completes successfully even though the first thread 
has not yet consumed the RPC response. This test passes successfully with this 
patch whereas it fails with trunk.

Please let me know is it addresses your concerns. If so, then the only 
remaining thing to make this patch committable is to demonstrate that it does 
not degrade performance for sort runs.

> Add a Random backoff for the initial block report sent to the Name node
> ---
>
> Key: HADOOP-2326
> URL: https://issues.apache.org/jira/browse/HADOOP-2326
> Project: Hadoop
>  Issue Type: Improvement
>  Components: dfs
>Reporter: Sanjay Radia
>Assignee: Sanjay Radia
> Fix For: 0.16.0
>
> Attachments: asyncRPC-4.patch
>
>
> Startup time can be improved if the initial block reports are spread randomly 
> over small period of time.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Deleted: (HADOOP-1828) [Hbase Shell] Switch Command for sub-shell

2007-11-30 Thread Edward Yoon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Yoon deleted HADOOP-1828:



> [Hbase Shell] Switch Command for sub-shell
> --
>
> Key: HADOOP-1828
> URL: https://issues.apache.org/jira/browse/HADOOP-1828
> Project: Hadoop
>  Issue Type: Improvement
> Environment: All environments
>Reporter: Edward Yoon
>Priority: Minor
>
> This is a swtich command patch for the future implementation of sub-shell.
> {code}
> HBase > altools;
> Hbase altools, 0.0.1 version
> Type 'help;' for Hbase altools usage.
> HBase.Altools > help;
> Type 'help ;' to see command-specific usage.
> * Global commands.
> FSHadoop FsShell operations.
> EXIT  Exit shell
> SHOW  List all tables.
> CLEAR Clear the screen.
> DESCRIBE  Describe a table's columnfamilies.
> * Altools Commands.
> Projection ...
> HBase.Altools > exit;
> HBase > exit;
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

A tracing framework for Hadoop

2007-11-30 Thread Matei Zaharia

Hi,

We're grad students at UC Berkeley working on a project to instrument
Hadoop using an open-source path-based tracing framework called X-
Trace (www.x-trace.net/wiki). X-Trace captures causal dependencies
between events in addition to timings, letting developers analyze not
just performance but also context and dependencies for various events.
We have created a web-based trace analysis UI that shows performance
of different IPC calls, DFS operations, and phases of a MapReduce job.
The goal is to let users easily spot the origin of unusual behavior in
a running system at a centralized location. We believe that this kind
of tracing can be used for performance tuning and debugging in both
development and production environments.

We'd like to get feedback on our work and suggestions on what trace
analyses would be useful to Hadoop developers and users. Some of the
reports we currently generate include machine utilization over time,
relative performance of different tasks, and performance of DFS
operations. You can see an example set of reports at http://www.cs.berkeley.edu/~matei/xtrace_sample_task.html
(this is a trace of a Nutch indexing job). You can also read our
project journal at http://radlab.cs.berkeley.edu/wiki/Projects/Monitoring_Hadoop_through_Tracing
. We've already spotted some interesting issues, like map tasks and
DFS reads/writes that are an order of magnitude slower than the
average, and we are investigating possible causes for them. Most
importantly, the UI lets a user easily see where the system is
spending time and reason about how to tune it, and provides much more
information than the progress data in the JobTracker UI. As a Hadoop
developer, what kinds of questions do you want answered about running
jobs that are hard to obtain just from process logs?

Once we've had a discussion on features for a trace analysis UI, we
would like to contribute our work into the Hadoop codebase. We will
create a JIRA issue and patch adding this functionality. We're also
interested in seeing if we can integrate X-Trace logging more tightly
with the current Apache logging in Hadoop.

Finally, we are currently experimenting on relatively small (<50
nodes) clusters here at Berkeley, but we would really like to try
tracing some large (>1000 node) clusters. If there is someone
interested in evaluating performance on such a cluster, we would be
very happy to talk about how to set up X-Trace and provide you with a
patch.

1 2 >

1 - 100 of 134 matches

Mail list logo