[jira] Created: (HADOOP-2334) [hbase] VOTE: should row keys be less restrictive than hadoop.io.Text?

2007-12-03 Thread Jim Kellerman (JIRA)
[hbase] VOTE: should row keys be less restrictive than hadoop.io.Text?
--

 Key: HADOOP-2334
 URL: https://issues.apache.org/jira/browse/HADOOP-2334
 Project: Hadoop
  Issue Type: Wish
  Components: contrib/hbase
Affects Versions: 0.16.0
Reporter: Jim Kellerman
Assignee: Jim Kellerman
Priority: Minor
 Fix For: 0.16.0


I have heard from several people that row keys in HBase should be less 
restricted than hadoop.io.Text.

What do you think?

At the very least, a row key has to be a WritableComparable. This would lead to 
the most general case being either hadoop.io.BytesWritable or 
hbase.io.ImmutableBytesWritable. The primary difference between these two 
classes is that hadoop.io.BytesWritable by default allocates 100 bytes, and if 
you do not pay attention to the length (BytesWritable.getSize()), converting a 
String to a BytesWritable and vice versa can become problematic.

hbase.io.ImmutableBytesWritable, in contrast, only allocates as many bytes as 
you pass in and then does not allow the size to be changed.

If we were to change from Text to a non-text key, my preference would be for 
ImmutableBytesWritable, because it has a fixed size once set, and operations 
like get, etc. do not have to do something like System.arraycopy where you 
specify the number of bytes to copy.
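
To make the difference concrete, here is a minimal sketch. The BytesWritable 
calls are the real hadoop.io API; the ImmutableBytes class below is only a 
simplified stand-in for hbase.io.ImmutableBytesWritable, written to show the 
behaviour described above rather than copied from it.

{code}
// Sketch: length bookkeeping with BytesWritable versus an immutable wrapper.
import org.apache.hadoop.io.BytesWritable;

public class RowKeySketch {
  // Simplified stand-in for ImmutableBytesWritable: holds exactly the bytes
  // passed in, and its size cannot change afterwards.
  static class ImmutableBytes {
    private final byte[] bytes;
    ImmutableBytes(byte[] bytes) { this.bytes = bytes.clone(); }
    byte[] get() { return bytes; }
    int getSize() { return bytes.length; }
  }

  public static void main(String[] args) throws Exception {
    byte[] key = "row1".getBytes("UTF-8");

    BytesWritable bw = new BytesWritable();
    bw.set(key, 0, key.length);
    // The backing buffer may be larger than the valid data, so a round trip
    // back to String must honour getSize() or it drags trailing bytes along.
    String fromBw = new String(bw.get(), 0, bw.getSize(), "UTF-8");

    ImmutableBytes ib = new ImmutableBytes(key);
    String fromIb = new String(ib.get(), "UTF-8"); // no extra length bookkeeping

    System.out.println(fromBw + " / " + fromIb);
  }
}
{code}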

Your comments and questions are welcome on this issue. If we receive enough 
feedback that Text is too restrictive, we are willing to change it, but we also 
need to hear what would be the most useful thing to change it to.



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2331) [hbase] TestScanner2 does not release resources which sometimes cause the test to time out

2007-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547760
 ] 

Hadoop QA commented on HADOOP-2331:
---

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12370811/Hadoop-2331-patch.txt
against trunk revision r600244.

@author +1.  The patch does not contain any @author tags.

javadoc +1.  The javadoc tool did not generate any warning messages.

javac +1.  The applied patch does not generate any new compiler warnings.

findbugs +1.  The patch does not introduce any new Findbugs warnings.

core tests +1.  The patch passed core unit tests.

contrib tests +1.  The patch passed contrib unit tests.

Test results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1241/testReport/
Findbugs warnings: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1241/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1241/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1241/console

This message is automatically generated.

> [hbase] TestScanner2 does not release resources which sometimes cause the 
> test to time out
> --
>
> Key: HADOOP-2331
> URL: https://issues.apache.org/jira/browse/HADOOP-2331
> Project: Hadoop
>  Issue Type: Bug
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Jim Kellerman
>Assignee: Jim Kellerman
> Fix For: 0.16.0
>
> Attachments: Hadoop-2331-patch.txt
>
>
> TestScanner2 does not close HTables at the end of each test. This can 
> sometimes make the test take a long time to run or even time out

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2331) [hbase] TestScanner2 does not release resources which sometimes cause the test to time out

2007-12-03 Thread Jim Kellerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Kellerman updated HADOOP-2331:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Tests passed. Committed.

> [hbase] TestScanner2 does not release resources which sometimes cause the 
> test to time out
> --
>
> Key: HADOOP-2331
> URL: https://issues.apache.org/jira/browse/HADOOP-2331
> Project: Hadoop
>  Issue Type: Bug
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Jim Kellerman
>Assignee: Jim Kellerman
> Fix For: 0.16.0
>
> Attachments: Hadoop-2331-patch.txt
>
>
> TestScanner2 does not close HTables at the end of each test. This can 
> sometimes make the test take a long time to run or even time out

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-1900) the heartbeat and task event queries interval should be set dynamically by the JobTracker

2007-12-03 Thread Amareshwari Sri Ramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sri Ramadasu updated HADOOP-1900:
-

Attachment: patch-1900.txt

> the heartbeat and task event queries interval should be set dynamically by 
> the JobTracker
> -
>
> Key: HADOOP-1900
> URL: https://issues.apache.org/jira/browse/HADOOP-1900
> Project: Hadoop
>  Issue Type: Improvement
>  Components: mapred
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sri Ramadasu
> Fix For: 0.16.0
>
> Attachments: patch-1900.txt, patch-1900.txt, patch-1900.txt, 
> patch-1900.txt, patch-1900.txt, patch-1900.txt
>
>
> The JobTracker should scale the intervals that the TaskTrackers use to 
> contact it dynamically, based on how busy it is and the size of the 
> cluster.
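
As a rough illustration of what "set dynamically" could mean here, the sketch 
below derives the heartbeat interval from the current cluster size. The 
constants and the linear rule are illustrative assumptions, not the contents of 
the attached patch-1900.txt.

{code}
// Sketch: scale the TaskTracker heartbeat interval with cluster size.
public class HeartbeatIntervalSketch {
  static final int HEARTBEAT_INTERVAL_MIN_MS = 3 * 1000; // assumed 3 second floor
  static final int TRACKERS_PER_STEP = 50;               // assumed step size

  static int nextHeartbeatIntervalMs(int numTaskTrackers) {
    int steps = numTaskTrackers / TRACKERS_PER_STEP + 1;
    return Math.max(steps * HEARTBEAT_INTERVAL_MIN_MS, HEARTBEAT_INTERVAL_MIN_MS);
  }

  public static void main(String[] args) {
    System.out.println(nextHeartbeatIntervalMs(10));   // small cluster -> 3000 ms
    System.out.println(nextHeartbeatIntervalMs(1000)); // large cluster -> 63000 ms
  }
}
{code}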

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-1900) the heartbeat and task event queries interval should be set dynamically by the JobTracker

2007-12-03 Thread Amareshwari Sri Ramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sri Ramadasu updated HADOOP-1900:
-

Status: Patch Available  (was: Open)

patch with comments incorporated and tested.

> the heartbeat and task event queries interval should be set dynamically by 
> the JobTracker
> -
>
> Key: HADOOP-1900
> URL: https://issues.apache.org/jira/browse/HADOOP-1900
> Project: Hadoop
>  Issue Type: Improvement
>  Components: mapred
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sri Ramadasu
> Fix For: 0.16.0
>
> Attachments: patch-1900.txt, patch-1900.txt, patch-1900.txt, 
> patch-1900.txt, patch-1900.txt, patch-1900.txt
>
>
> The JobTracker should scale the intervals that the TaskTrackers use to 
> contact it dynamically, based on how busy it is and the size of the 
> cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2323) JobTracker.close() prints stack traces for exceptions that are not errors

2007-12-03 Thread Jim Kellerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Kellerman updated HADOOP-2323:
--

Status: Open  (was: Patch Available)

> JobTracker.close() prints stack traces for exceptions that are not errors
> -
>
> Key: HADOOP-2323
> URL: https://issues.apache.org/jira/browse/HADOOP-2323
> Project: Hadoop
>  Issue Type: Bug
>  Components: mapred
>Affects Versions: 0.16.0
>Reporter: Jim Kellerman
>Assignee: Jim Kellerman
>Priority: Minor
> Fix For: 0.16.0
>
> Attachments: patch.txt, patch.txt, patch.txt
>
>
> JobTracker.close() prints a stack trace for an interrupted exception even 
> though it was the method that interrupted the thread that threw the 
> exception. For example:
> {code}
>   this.expireTrackers.stopTracker();
>   try {
> this.expireTrackersThread.interrupt();
> this.expireTrackersThread.join();
>   } catch (InterruptedException ex) {
> ex.printStackTrace();
>   }
> {code}
> Well of course it is going to catch an InterruptedException after it just 
> interrupted the thread!
> This is *not* an error and should  *not* be dumped to the logs!
> In other circumstances, catching InterruptedException is entirely 
> appropriate. Just not in close where you've told the thread to shutdown and 
> then interrupted it to ensure it does!
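
One way to quiet the expected interruption during shutdown is sketched below 
against the snippet quoted above; this shows the idea only and is not 
necessarily what the attached patch.txt does.

{code}
  // Sketch: the interrupt is expected here, so restore the interrupt status
  // instead of printing a stack trace.
  this.expireTrackers.stopTracker();
  try {
    this.expireTrackersThread.interrupt();
    this.expireTrackersThread.join();
  } catch (InterruptedException ex) {
    // expected during close(); not an error, so do not dump it to the logs
    Thread.currentThread().interrupt();
  }
{code}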

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-1327) Doc on Streaming

2007-12-03 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547769
 ] 

Owen O'Malley commented on HADOOP-1327:
---

This looks good, but there are some typos:

chararacter -> character
fouth -> forth
 

> Doc on Streaming
> 
>
> Key: HADOOP-1327
> URL: https://issues.apache.org/jira/browse/HADOOP-1327
> Project: Hadoop
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Runping Qi
> Attachments: HADOOP-1327.patch, site.xml, streaming.html, 
> streaming.xml
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2333) [hbase] client side retries happen at the wrong level

2007-12-03 Thread Jim Kellerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Kellerman updated HADOOP-2333:
--

Status: Open  (was: Patch Available)

> [hbase] client side retries happen at the wrong level
> -
>
> Key: HADOOP-2333
> URL: https://issues.apache.org/jira/browse/HADOOP-2333
> Project: Hadoop
>  Issue Type: Bug
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Jim Kellerman
>Assignee: Jim Kellerman
> Fix For: 0.16.0
>
>
> Currently client side retries are handled by 
> HConnectionManager$TableServers.scanOneMetaRegion. This is ok for regions 
> that have never been on-line, because they won't be found in the meta table.
> However, for regions that have been on-line and have since gone off-line (for 
> example, because of a region split), entries for the table are found in the 
> meta table, but they are incorrect.
> In the latter case, the scan of the meta table succeeded, but the new regions 
> are not yet on-line. If any retries are done, they are done without any wait 
> period. For example:
> {code}
> 2007-12-03 05:57:30,433 INFO  [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HRegion.needsSplit(HRegion.java:657): Splitting 
> mrtest,,1196661378142 because largest aggregate size is 815.3k and desired 
> size is 256.0k
> 2007-12-03 05:57:30,436 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HRegionServer$Splitter.closing(HRegionServer.java:217):
>  mrtest,,1196661378142 closing (Adding to retiringRegions)
> 2007-12-03 05:57:30,436 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HRegion.snapshotMemcaches(HRegion.java:828): Started 
> memcache flush for region mrtest,,1196661378142. Size 35.9k
> 2007-12-03 05:57:30,702 DEBUG [RegionServer:0.cacheFlusher] 
> org.apache.hadoop.hbase.HRegionServer$Flusher.run(HRegionServer.java:449): 
> flushing region -ROOT-,,0
> 2007-12-03 05:57:30,702 DEBUG [RegionServer:0.cacheFlusher] 
> org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:882): Not 
> flushing cache for region -ROOT-,,0: snapshotMemcaches() determined that 
> there was nothing to do
> 2007-12-03 05:57:30,894 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.internalFlushCache(HStore.java:930): Added 
> -1097746468/text/7689205514340304602 with sequence id 30021 and size 42.9k
> 2007-12-03 05:57:30,932 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.internalFlushCache(HStore.java:930): Added 
> -1097746468/contents/6256958480029690533 with sequence id 30021 and size 110.0
> 2007-12-03 05:57:30,933 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:939): 
> Finished memcache flush for region mrtest,,1196661378142 in 497ms, 
> sequenceid=30021
> 2007-12-03 05:57:30,933 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.close(HStore.java:840): closed -1097746468/text
> 2007-12-03 05:57:30,934 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.close(HStore.java:840): closed 
> -1097746468/contents
> 2007-12-03 05:57:30,934 INFO  [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HRegion.close(HRegion.java:428): closed 
> mrtest,,1196661378142
> 2007-12-03 05:57:30,934 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HRegionServer$Splitter.closed(HRegionServer.java:231):
>  mrtest,,1196661378142 closed
> 2007-12-03 05:57:31,924 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.(HStore.java:581): starting 
> -730914122/contents (no reconstruction log)
> 2007-12-03 05:57:31,964 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.(HStore.java:617): maximum sequence id 
> for hstore -730914122/contents is 30021
> 2007-12-03 05:57:31,979 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.(HStore.java:581): starting 
> -730914122/text (no reconstruction log)
> 2007-12-03 05:57:31,996 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.(HStore.java:617): maximum sequence id 
> for hstore -730914122/text is 30021
> 2007-12-03 05:57:32,856 INFO  [HMaster.rootScanner] 
> org.apache.hadoop.hbase.HMaster$BaseScanner.scanRegion(HMaster.java:212): 
> HMaster.rootScanner scanning meta region regionname: -ROOT-,,0, startKey: <>, 
> server: 140.211.11.75:47137}
> 2007-12-03 05:57:33,479 WARN  [Task Commit Thread] 
> org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2011):
>  Task Commit Thread exiting...
> 2007-12-03 05:57:33,481 ERROR [IPC Server handler 3 on 47137] 
> org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1378): 
> Error opening scanner (fsOk: true)
> org.apache.hadoop.hbase.NotServingRegionException: mrtest,,1196661378142
>   at 
> org.apache.hadoop.hbase.HRegionServer.getRegion(HReg
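
The behaviour described above suggests retrying the meta lookup with a pause 
between attempts rather than immediately. A minimal sketch of that idea 
follows; the helper names (findRegion, isOnline) and the retry and pause values 
are assumptions for illustration, not the actual 
HConnectionManager$TableServers code.

{code}
// Sketch: retry the meta lookup with a wait so that the daughter regions of a
// split have time to come on-line.  Helper names and values are illustrative.
private RegionInfo findOnlineRegion(Text tableName, Text row)
    throws IOException, InterruptedException {
  final int maxRetries = 10;
  final long pauseMs = 2000;
  for (int attempt = 0; attempt < maxRetries; attempt++) {
    RegionInfo info = findRegion(tableName, row); // may return a stale, off-line entry
    if (info != null && isOnline(info)) {
      return info;
    }
    Thread.sleep(pauseMs); // wait before rescanning the meta table
  }
  throw new IOException("Region for row " + row + " not on-line after " + maxRetries + " retries");
}
{code}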

[jira] Updated: (HADOOP-2323) JobTracker.close() prints stack traces for exceptions that are not errors

2007-12-03 Thread Jim Kellerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Kellerman updated HADOOP-2323:
--

Status: Patch Available  (was: Open)

> JobTracker.close() prints stack traces for exceptions that are not errors
> -
>
> Key: HADOOP-2323
> URL: https://issues.apache.org/jira/browse/HADOOP-2323
> Project: Hadoop
>  Issue Type: Bug
>  Components: mapred
>Affects Versions: 0.16.0
>Reporter: Jim Kellerman
>Assignee: Jim Kellerman
>Priority: Minor
> Fix For: 0.16.0
>
> Attachments: patch.txt, patch.txt, patch.txt
>
>
> JobTracker.close() prints a stack trace for an interrupted exception even 
> though it was the method that interrupted the thread that threw the 
> exception. For example:
> {code}
>   this.expireTrackers.stopTracker();
>   try {
> this.expireTrackersThread.interrupt();
> this.expireTrackersThread.join();
>   } catch (InterruptedException ex) {
> ex.printStackTrace();
>   }
> {code}
> Well of course it is going to catch an InterruptedException after it just 
> interrupted the thread!
> This is *not* an error and should  *not* be dumped to the logs!
> In other circumstances, catching InterruptedException is entirely 
> appropriate. Just not in close where you've told the thread to shutdown and 
> then interrupted it to ensure it does!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2333) [hbase] client side retries happen at the wrong level

2007-12-03 Thread Jim Kellerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Kellerman updated HADOOP-2333:
--

Status: Patch Available  (was: Open)

> [hbase] client side retries happen at the wrong level
> -
>
> Key: HADOOP-2333
> URL: https://issues.apache.org/jira/browse/HADOOP-2333
> Project: Hadoop
>  Issue Type: Bug
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Jim Kellerman
>Assignee: Jim Kellerman
> Fix For: 0.16.0
>
>
> Currently client side retries are handled by 
> HConnectionManager$TableServers.scanOneMetaRegion. This is ok for regions 
> that have never been on-line, because they won't be found in the meta table.
> However, for regions that have been on-line and have since gone off-line (for 
> example, because of a region split), entries for the table are found in the 
> meta table, but they are incorrect.
> In the latter case, the scan of the meta table succeeded, but the new regions 
> are not yet on-line. If any retries are done, they are done without any wait 
> period. For example:
> {code}
> 2007-12-03 05:57:30,433 INFO  [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HRegion.needsSplit(HRegion.java:657): Splitting 
> mrtest,,1196661378142 because largest aggregate size is 815.3k and desired 
> size is 256.0k
> 2007-12-03 05:57:30,436 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HRegionServer$Splitter.closing(HRegionServer.java:217):
>  mrtest,,1196661378142 closing (Adding to retiringRegions)
> 2007-12-03 05:57:30,436 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HRegion.snapshotMemcaches(HRegion.java:828): Started 
> memcache flush for region mrtest,,1196661378142. Size 35.9k
> 2007-12-03 05:57:30,702 DEBUG [RegionServer:0.cacheFlusher] 
> org.apache.hadoop.hbase.HRegionServer$Flusher.run(HRegionServer.java:449): 
> flushing region -ROOT-,,0
> 2007-12-03 05:57:30,702 DEBUG [RegionServer:0.cacheFlusher] 
> org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:882): Not 
> flushing cache for region -ROOT-,,0: snapshotMemcaches() determined that 
> there was nothing to do
> 2007-12-03 05:57:30,894 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.internalFlushCache(HStore.java:930): Added 
> -1097746468/text/7689205514340304602 with sequence id 30021 and size 42.9k
> 2007-12-03 05:57:30,932 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.internalFlushCache(HStore.java:930): Added 
> -1097746468/contents/6256958480029690533 with sequence id 30021 and size 110.0
> 2007-12-03 05:57:30,933 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HRegion.internalFlushcache(HRegion.java:939): 
> Finished memcache flush for region mrtest,,1196661378142 in 497ms, 
> sequenceid=30021
> 2007-12-03 05:57:30,933 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.close(HStore.java:840): closed -1097746468/text
> 2007-12-03 05:57:30,934 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.close(HStore.java:840): closed 
> -1097746468/contents
> 2007-12-03 05:57:30,934 INFO  [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HRegion.close(HRegion.java:428): closed 
> mrtest,,1196661378142
> 2007-12-03 05:57:30,934 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HRegionServer$Splitter.closed(HRegionServer.java:231):
>  mrtest,,1196661378142 closed
> 2007-12-03 05:57:31,924 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.(HStore.java:581): starting 
> -730914122/contents (no reconstruction log)
> 2007-12-03 05:57:31,964 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.(HStore.java:617): maximum sequence id 
> for hstore -730914122/contents is 30021
> 2007-12-03 05:57:31,979 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.(HStore.java:581): starting 
> -730914122/text (no reconstruction log)
> 2007-12-03 05:57:31,996 DEBUG [RegionServer:0.splitter] 
> org.apache.hadoop.hbase.HStore.(HStore.java:617): maximum sequence id 
> for hstore -730914122/text is 30021
> 2007-12-03 05:57:32,856 INFO  [HMaster.rootScanner] 
> org.apache.hadoop.hbase.HMaster$BaseScanner.scanRegion(HMaster.java:212): 
> HMaster.rootScanner scanning meta region regionname: -ROOT-,,0, startKey: <>, 
> server: 140.211.11.75:47137}
> 2007-12-03 05:57:33,479 WARN  [Task Commit Thread] 
> org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2011):
>  Task Commit Thread exiting...
> 2007-12-03 05:57:33,481 ERROR [IPC Server handler 3 on 47137] 
> org.apache.hadoop.hbase.HRegionServer.openScanner(HRegionServer.java:1378): 
> Error opening scanner (fsOk: true)
> org.apache.hadoop.hbase.NotServingRegionException: mrtest,,1196661378142
>   at 
> org.apache.hadoop.hbase.HRegionServer.getRegion(HReg

[jira] Commented: (HADOOP-1900) the heartbeat and task event queries interval should be set dynamically by the JobTracker

2007-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547776
 ] 

Hadoop QA commented on HADOOP-1900:
---

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12370812/patch-1900.txt
against trunk revision r600244.

@author +1.  The patch does not contain any @author tags.

javadoc +1.  The javadoc tool did not generate any warning messages.

javac +1.  The applied patch does not generate any new compiler warnings.

findbugs +1.  The patch does not introduce any new Findbugs warnings.

core tests +1.  The patch passed core unit tests.

contrib tests +1.  The patch passed contrib unit tests.

Test results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1242/testReport/
Findbugs warnings: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1242/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1242/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1242/console

This message is automatically generated.

> the heartbeat and task event queries interval should be set dynamically by 
> the JobTracker
> -
>
> Key: HADOOP-1900
> URL: https://issues.apache.org/jira/browse/HADOOP-1900
> Project: Hadoop
>  Issue Type: Improvement
>  Components: mapred
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sri Ramadasu
> Fix For: 0.16.0
>
> Attachments: patch-1900.txt, patch-1900.txt, patch-1900.txt, 
> patch-1900.txt, patch-1900.txt, patch-1900.txt
>
>
> The JobTracker should scale the intervals that the TaskTrackers use to 
> contact it dynamically, based on how busy it is and the size of the 
> cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2298) ant target without source and docs

2007-12-03 Thread Gautam Kowshik (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547793
 ] 

Gautam Kowshik commented on HADOOP-2298:


>We could provide an ant target that builds a binary tarball, but still release 
>the compound tarball. Folks could then run the binary-only target from within 
>a release if they want to build a binary-only tarball. Might that suffice?

works for me. 

> ant target without source and docs 
> ---
>
> Key: HADOOP-2298
> URL: https://issues.apache.org/jira/browse/HADOOP-2298
> Project: Hadoop
>  Issue Type: Improvement
>  Components: build
>Reporter: Gautam Kowshik
>
> Can we have an ant target or a -D option to build the hadoop tar without the 
> source and documentation? This brings down the tar size from 11.5 MB to 5.6 
> MB. This would speed up distribution. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2323) JobTracker.close() prints stack traces for exceptions that are not errors

2007-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547787
 ] 

Hadoop QA commented on HADOOP-2323:
---

+1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12370806/patch.txt
against trunk revision r600443.

@author +1.  The patch does not contain any @author tags.

javadoc +1.  The javadoc tool did not generate any warning messages.

javac +1.  The applied patch does not generate any new compiler warnings.

findbugs +1.  The patch does not introduce any new Findbugs warnings.

core tests +1.  The patch passed core unit tests.

contrib tests +1.  The patch passed contrib unit tests.

Test results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1243/testReport/
Findbugs warnings: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1243/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1243/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1243/console

This message is automatically generated.

> JobTracker.close() prints stack traces for exceptions that are not errors
> -
>
> Key: HADOOP-2323
> URL: https://issues.apache.org/jira/browse/HADOOP-2323
> Project: Hadoop
>  Issue Type: Bug
>  Components: mapred
>Affects Versions: 0.16.0
>Reporter: Jim Kellerman
>Assignee: Jim Kellerman
>Priority: Minor
> Fix For: 0.16.0
>
> Attachments: patch.txt, patch.txt, patch.txt
>
>
> JobTracker.close() prints a stack trace for an interrupted exception even 
> though it was the method that interrupted the thread that threw the 
> exception. For example:
> {code}
>   this.expireTrackers.stopTracker();
>   try {
> this.expireTrackersThread.interrupt();
> this.expireTrackersThread.join();
>   } catch (InterruptedException ex) {
> ex.printStackTrace();
>   }
> {code}
> Well of course it is going to catch an InterruptedException after it just 
> interrupted the thread!
> This is *not* an error and should  *not* be dumped to the logs!
> In other circumstances, catching InterruptedException is entirely 
> appropriate. Just not in close where you've told the thread to shutdown and 
> then interrupted it to ensure it does!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-1900) the heartbeat and task event queries interval should be set dynamically by the JobTracker

2007-12-03 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547797
 ] 

Devaraj Das commented on HADOOP-1900:
-

Sorry, one more comment: the "if (diff > minWait)" condition should check for 
greater-than-or-equal-to.
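
Spelled out with the variable names from the comment above, the suggested 
change is simply:

{code}
// before: if (diff > minWait) { ... }
// after:
if (diff >= minWait) {
  // proceed with the query
}
{code}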


> the heartbeat and task event queries interval should be set dynamically by 
> the JobTracker
> -
>
> Key: HADOOP-1900
> URL: https://issues.apache.org/jira/browse/HADOOP-1900
> Project: Hadoop
>  Issue Type: Improvement
>  Components: mapred
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sri Ramadasu
> Fix For: 0.16.0
>
> Attachments: patch-1900.txt, patch-1900.txt, patch-1900.txt, 
> patch-1900.txt, patch-1900.txt, patch-1900.txt
>
>
> The JobTracker should scale the intervals that the TaskTrackers use to 
> contact it dynamically, based on how busy it is and the size of the 
> cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-1900) the heartbeat and task event queries interval should be set dynamically by the JobTracker

2007-12-03 Thread Amareshwari Sri Ramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sri Ramadasu updated HADOOP-1900:
-

Status: Patch Available  (was: Open)

> the heartbeat and task event queries interval should be set dynamically by 
> the JobTracker
> -
>
> Key: HADOOP-1900
> URL: https://issues.apache.org/jira/browse/HADOOP-1900
> Project: Hadoop
>  Issue Type: Improvement
>  Components: mapred
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sri Ramadasu
> Fix For: 0.16.0
>
> Attachments: patch-1900.txt, patch-1900.txt, patch-1900.txt, 
> patch-1900.txt, patch-1900.txt, patch-1900.txt, patch-1900.txt
>
>
> The JobTracker should scale the intervals that the TaskTrackers use to 
> contact it dynamically, based on how busy it is and the size of the 
> cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-1900) the heartbeat and task event queries interval should be set dynamically by the JobTracker

2007-12-03 Thread Amareshwari Sri Ramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547802
 ] 

Amareshwari Sri Ramadasu commented on HADOOP-1900:
--

bq. "if (diff > minWait) " condition should check for greater-than-or-equal-to
changed

> the heartbeat and task event queries interval should be set dynamically by 
> the JobTracker
> -
>
> Key: HADOOP-1900
> URL: https://issues.apache.org/jira/browse/HADOOP-1900
> Project: Hadoop
>  Issue Type: Improvement
>  Components: mapred
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sri Ramadasu
> Fix For: 0.16.0
>
> Attachments: patch-1900.txt, patch-1900.txt, patch-1900.txt, 
> patch-1900.txt, patch-1900.txt, patch-1900.txt, patch-1900.txt
>
>
> The JobTracker should scale the intervals that the TaskTrackers use to 
> contact it dynamically, based on how busy it is and the size of the 
> cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2300) mapred.tasktracker.tasks.maximum is completely ignored

2007-12-03 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HADOOP-2300:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amareshwari!

> mapred.tasktracker.tasks.maximum is completely ignored
> --
>
> Key: HADOOP-2300
> URL: https://issues.apache.org/jira/browse/HADOOP-2300
> Project: Hadoop
>  Issue Type: Bug
>  Components: mapred
>Affects Versions: 0.16.0
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sri Ramadasu
>Priority: Blocker
> Fix For: 0.16.0
>
> Attachments: patch-2300.txt
>
>
> HADOOP-1274 replaced the configuration attribute 
> mapred.tasktracker.tasks.maximum with mapred.tasktracker.map.tasks.maximum 
> and mapred.tasktracker.reduce.tasks.maximum and claims to use the deprecated 
> mapred.tasktracker.tasks.maximum. However, because the new attributes are in 
> hadoop-default.xml, the check to use the deprecated value will never trigger.
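
A sketch of the fallback pattern being described, using the key names from the 
description above; the actual TaskTracker code may differ.

{code}
// Because mapred.tasktracker.map.tasks.maximum always receives a value from
// hadoop-default.xml, getInt() never returns the -1 sentinel and the deprecated
// key is never consulted.
static int maxMapTasks(org.apache.hadoop.conf.Configuration conf) {
  int max = conf.getInt("mapred.tasktracker.map.tasks.maximum", -1);
  if (max == -1) {                                            // never true in practice
    max = conf.getInt("mapred.tasktracker.tasks.maximum", 2); // dead branch
  }
  return max;
}
{code}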

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-1900) the heartbeat and task event queries interval should be set dynamically by the JobTracker

2007-12-03 Thread Amareshwari Sri Ramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sri Ramadasu updated HADOOP-1900:
-

Attachment: patch-1900.txt

> the heartbeat and task event queries interval should be set dynamically by 
> the JobTracker
> -
>
> Key: HADOOP-1900
> URL: https://issues.apache.org/jira/browse/HADOOP-1900
> Project: Hadoop
>  Issue Type: Improvement
>  Components: mapred
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sri Ramadasu
> Fix For: 0.16.0
>
> Attachments: patch-1900.txt, patch-1900.txt, patch-1900.txt, 
> patch-1900.txt, patch-1900.txt, patch-1900.txt, patch-1900.txt
>
>
> The JobTracker should scale the intervals that the TaskTrackers use to 
> contact it dynamically, based on how busy it is and the size of the 
> cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-1900) the heartbeat and task event queries interval should be set dynamically by the JobTracker

2007-12-03 Thread Amareshwari Sri Ramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sri Ramadasu updated HADOOP-1900:
-

Status: Open  (was: Patch Available)

> the heartbeat and task event queries interval should be set dynamically by 
> the JobTracker
> -
>
> Key: HADOOP-1900
> URL: https://issues.apache.org/jira/browse/HADOOP-1900
> Project: Hadoop
>  Issue Type: Improvement
>  Components: mapred
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sri Ramadasu
> Fix For: 0.16.0
>
> Attachments: patch-1900.txt, patch-1900.txt, patch-1900.txt, 
> patch-1900.txt, patch-1900.txt, patch-1900.txt, patch-1900.txt
>
>
> The JobTracker should scale the intervals that the TaskTrackers use to 
> contact it dynamically, based on how busy it is and the size of the 
> cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-1900) the heartbeat and task event queries interval should be set dynamically by the JobTracker

2007-12-03 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HADOOP-1900:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amareshwari!

> the heartbeat and task event queries interval should be set dynamically by 
> the JobTracker
> -
>
> Key: HADOOP-1900
> URL: https://issues.apache.org/jira/browse/HADOOP-1900
> Project: Hadoop
>  Issue Type: Improvement
>  Components: mapred
>Reporter: Owen O'Malley
>Assignee: Amareshwari Sri Ramadasu
> Fix For: 0.16.0
>
> Attachments: patch-1900.txt, patch-1900.txt, patch-1900.txt, 
> patch-1900.txt, patch-1900.txt, patch-1900.txt, patch-1900.txt
>
>
> The JobTracker should scale the intervals that the TaskTrackers use to 
> contact it dynamically, based on how busy it is and the size of the 
> cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2321) Streaming: better support for command lines or streaming command

2007-12-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547810
 ] 

Hudson commented on HADOOP-2321:


Integrated in Hadoop-Nightly #321 (See 
[http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/321/])

> Streaming: better support for command lines or streaming command
> 
>
> Key: HADOOP-2321
> URL: https://issues.apache.org/jira/browse/HADOOP-2321
> Project: Hadoop
>  Issue Type: Improvement
>  Components: contrib/streaming
>Reporter: arkady borkovsky
>
> Quite often, the command line for streaming mapper or reducer needs to use 
> one or two levels of quotes.
> This makes it inconvenient or impossible to pass the commands on the streaming 
> command line.
> It would be good to have streaming take its specification from a file -- 
> especially as longer streaming commands are not typed in, but are either run 
> from files (shell scripts) or generated by other processors.
> The current workaround is to use separate files for the mapper command, the 
> reducer command, and the streaming command itself.  This works, but is 
> inconvenient and quite error-prone.
> Having just one file with all three would be good.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2268) JobControl classes should use interfaces rather than implemenations

2007-12-03 Thread Adrian Woodhead (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547851
 ] 

Adrian Woodhead commented on HADOOP-2268:
-

Understood.

I personally don't like the idea of dropping the "get" from those method names; 
it seems much clearer to have that in the name. "readyJobs" could also mean 
"get the jobs ready", which is confusing. 

Another option would be to leave those 3 methods returning ArrayList, and I 
would modify the patch to just change the internal Hashtables to Maps and do 
the deprecation for dependingJobs, with the method renamed as discussed 
earlier. The code will be better than it was, but you will still have the 
implementation "leak" on those 3 methods.

> JobControl classes should use interfaces rather than implemenations
> ---
>
> Key: HADOOP-2268
> URL: https://issues.apache.org/jira/browse/HADOOP-2268
> Project: Hadoop
>  Issue Type: Improvement
>  Components: mapred
>Affects Versions: 0.15.0
>Reporter: Adrian Woodhead
>Assignee: Adrian Woodhead
>Priority: Minor
> Fix For: 0.16.0
>
> Attachments: HADOOP-2268-1.patch, HADOOP-2268-2.patch
>
>
> See HADOOP-2202 for background on this issue. Arun C. Murthy agrees that when 
> possible it is preferable to program against the interface rather than a 
> concrete implementation (more flexible, allows for changes of the 
> implementation in future etc.) JobControl currently exposes running, waiting, 
> ready, successful and dependent jobs as ArrayList rather than List. I propose 
> to change this to List.
> I will code up a patch for this.
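
The proposed change amounts to widening the return types, roughly as sketched 
below (method and field names are illustrative; the actual JobControl 
signatures may differ).

{code}
// before: public ArrayList getReadyJobs() { return readyJobs; }
// after: callers see only the interface, leaving the implementation free to change.
public List getReadyJobs() {
  return readyJobs;
}
{code}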

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2315) [hbase] REST servlet doesn't treat / characters in row key correctly

2007-12-03 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HADOOP-2315:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed.  Resolving.

> [hbase] REST servlet doesn't treat / characters in row key correctly
> 
>
> Key: HADOOP-2315
> URL: https://issues.apache.org/jira/browse/HADOOP-2315
> Project: Hadoop
>  Issue Type: Bug
>  Components: contrib/hbase
>Reporter: Bryan Duxbury
>Assignee: stack
>Priority: Trivial
> Fix For: 0.16.0
>
> Attachments: fix-urlencode-keys-v2.patch, fix-urlencode-keys.patch
>
>
> Using row keys like "com.site.www/:http" currently doesn't work. We've 
> tracked it down to the use of request.getPathInfo() instead of 
> request.getRequestURI() in Dispatcher.getPathSegments.
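
For context, a sketch of the difference (standard Servlet API behaviour; 
Dispatcher internals are not shown, and the example paths are illustrative):

{code}
// getPathInfo() is URL-decoded by the container, so an encoded '/' in a row key
// comes back as a real path separator and splits the key; getRequestURI()
// preserves the raw form so the key can be decoded deliberately.
static void showDifference(javax.servlet.http.HttpServletRequest request) {
  // Request: GET /api/mytable/com.site.www%2F:http
  String decoded = request.getPathInfo();   // "/mytable/com.site.www/:http"  -- key split in two
  String raw     = request.getRequestURI(); // "/api/mytable/com.site.www%2F:http"
  System.out.println(decoded + " vs " + raw);
}
{code}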

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2068) [hbase] RESTful interface

2007-12-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547908
 ] 

stack commented on HADOOP-2068:
---

You are right that the code and spec are out of sync, Billy.  Inside 
putRowXml, there is the following code on a successful put, around line #324 of 
putRowXml:

{code}
  // respond with a 200
  response.setStatus(200);  
{code}

Should be 201.
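
The one-line change being suggested, as a sketch:

{code}
  // respond with a 201 Created instead of a plain 200
  response.setStatus(HttpServletResponse.SC_CREATED);
{code}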

It's odd that it's returning success but nothing is added. Looking at the code, 
that shouldn't be possible.

Keep on asking questions (and finding bugs).  Thanks Billy.

> [hbase] RESTful interface
> -
>
> Key: HADOOP-2068
> URL: https://issues.apache.org/jira/browse/HADOOP-2068
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: stack
>Assignee: Bryan Duxbury
>Priority: Minor
> Fix For: 0.16.0
>
> Attachments: rest-11-27-07-v2.patch, rest-11-27-07.3.patc, 
> rest-11-27-07.patch, rest-11-28-07.2.patch, rest-11-28-07.3.patch, 
> rest-11-28-07.patch, rest.patch
>
>
> A RESTful interface would be one means of making hbase accessible to clients 
> that are not java.  It might look something like the below:
> + An HTTP GET of  http://MASTER:PORT/ outputs the master's attributes: online 
> meta regions, list of tables, etc.: i.e. what you see now when you go to 
> http://MASTER:PORT/master.jsp.
> + An HTTP GET of http://MASTER:PORT/TABLENAME: 200 if tables exists and 
> HTableDescription (mimetype: text/plain or text/xml) or 401 if no such table. 
>  HTTP DELETE would drop the table.  HTTP PUT would add one.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW: 200 if row exists and 401 
> if not.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNFAMILY: 
> HColumnDescriptor (mimetype: text/plain or text/xml) or 401 if no such table.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME/: 200 and latest 
> version (mimetype: binary/octet-stream) or 401 if no such cell. HTTP DELETE 
> would delete the cell.  HTTP PUT would add a new version.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME/TIMESTAMP: 200 
> (mimetype: binary/octet-stream) or 401 if no such cell. HTTP DELETE would 
> remove.  HTTP PUT would put this record.
> + Browser originally goes against master but master then redirects to the 
> hosting region server to serve, update, delete, etc. the addressed cell
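
As a quick illustration of a client hitting the scheme above, here is a sketch 
of a plain HTTP GET; the host, port, table and column names are placeholders 
based on the proposal, not a running setup.

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class RestGetSketch {
  public static void main(String[] args) throws Exception {
    // GET http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME from the proposal above
    URL url = new URL("http://master.example.com:60010/mytable/row1/contents:");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    System.out.println("HTTP " + conn.getResponseCode()); // 200 if the cell exists
    BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    for (String line; (line = in.readLine()) != null; ) {
      System.out.println(line);
    }
    in.close();
  }
}
{code}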

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2316) [hbase] Run REST servlet outside of master

2007-12-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547919
 ] 

stack commented on HADOOP-2316:
---

A few comments, B:

+ Would suggest that you add support for '-h|--help' and fail if folks provide 
args that you don't handle, outputting a usage message that includes defaults 
for port and address
+ Are the methods getWebAppsPath and getWebAppDir copied from InfoServer?  If 
so, should we make it so this code refers to them over there?  What would need 
to be done?

Otherwise patch looks good Bryan.


> [hbase] Run REST servlet outside of master
> --
>
> Key: HADOOP-2316
> URL: https://issues.apache.org/jira/browse/HADOOP-2316
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: Bryan Duxbury
>Assignee: Bryan Duxbury
>Priority: Minor
> Attachments: rest-external.patch
>
>
> In order to support the desired deployment strategy, we need to be able to run 
> the REST servlet independently of the master info server. We should add a 
> new option to the bin/hbase command ("rest"?) that optionally takes a port 
> and bind address and starts the servlet outside of any other project.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HADOOP-1118) NullPointerException in DistributedFileSystem$RawDistributedFileSystem.reportChecksumFailure

2007-12-03 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang resolved HADOOP-1118.
---

Resolution: Fixed

No longer a problem since the CRC patch went in.

> NullPointerException in 
> DistributedFileSystem$RawDistributedFileSystem.reportChecksumFailure
> 
>
> Key: HADOOP-1118
> URL: https://issues.apache.org/jira/browse/HADOOP-1118
> Project: Hadoop
>  Issue Type: Bug
>  Components: dfs
>Affects Versions: 0.12.0
>Reporter: Nigel Daley
>Assignee: Hairong Kuang
> Attachments: NPEChecksum.patch, NPEChecksum.patch, NPEChecksum1.patch
>
>
> I saw one NullPointerException in the JT log during a large sort run:
> 2007-03-14 08:36:10,210 INFO org.apache.hadoop.mapred.TaskInProgress: Error 
> from task_0002_m_037315_1: java.lang.NullPointerException
> at 
> org.apache.hadoop.dfs.DistributedFileSystem$RawDistributedFileSystem.reportChecksumFailure(DistributedFileSystem.java:326)
> at 
> org.apache.hadoop.dfs.DistributedFileSystem.reportChecksumFailure(DistributedFileSystem.java:405)
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.verifySum(ChecksumFileSystem.java:253)
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.readBuffer(ChecksumFileSystem.java:211)
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.read(ChecksumFileSystem.java:167)
> at 
> org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:41)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
> at java.io.DataInputStream.readFully(DataInputStream.java:178)
> at java.io.DataInputStream.readFully(DataInputStream.java:152)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.sync(SequenceFile.java:1712)
> at 
> org.apache.hadoop.mapred.SequenceFileRecordReader.(SequenceFileRecordReader.java:45)
> at 
> org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:55)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:139)
> at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2316) [hbase] Run REST servlet outside of master

2007-12-03 Thread Bryan Duxbury (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Duxbury updated HADOOP-2316:
--

Attachment: rest-external.patch

Here's a preliminary shot at it.

> [hbase] Run REST servlet outside of master
> --
>
> Key: HADOOP-2316
> URL: https://issues.apache.org/jira/browse/HADOOP-2316
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: Bryan Duxbury
>Assignee: Bryan Duxbury
>Priority: Minor
> Attachments: rest-external.patch
>
>
> In order to support the desired deployment strategy, we need to be able to run 
> the REST servlet independently of the master info server. We should add a 
> new option to the bin/hbase command ("rest"?) that optionally takes a port 
> and bind address and starts the servlet outside of any other project.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-1327) Doc on Streaming

2007-12-03 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547928
 ] 

Owen O'Malley commented on HADOOP-1327:
---

When I run forrest on this patch, it fails when processing the new streaming 
document.

> Doc on Streaming
> 
>
> Key: HADOOP-1327
> URL: https://issues.apache.org/jira/browse/HADOOP-1327
> Project: Hadoop
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Runping Qi
> Attachments: HADOOP-1327.patch, site.xml, streaming.html, 
> streaming.xml
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-2335) Streaming: when a job is killed, the message should say it was "killed" rather than "failed"

2007-12-03 Thread arkady borkovsky (JIRA)
Streaming: when a job is killed, the message should say it was "killed" rather 
than "failed"


 Key: HADOOP-2335
 URL: https://issues.apache.org/jira/browse/HADOOP-2335
 Project: Hadoop
  Issue Type: Improvement
  Components: contrib/streaming
Reporter: arkady borkovsky




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2332) [Hbase Shell] Meta table data selection in Hbase Shell

2007-12-03 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HADOOP-2332:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed. Resolving.  Thanks for the fix Edward.

> [Hbase Shell] Meta table data selection in Hbase Shell
> --
>
> Key: HADOOP-2332
> URL: https://issues.apache.org/jira/browse/HADOOP-2332
> Project: Hadoop
>  Issue Type: Bug
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Edward Yoon
>Assignee: Edward Yoon
> Fix For: 0.16.0
>
> Attachments: 2332.patch
>
>
> {code}
> I tried select * from .META.; but no luck from shell.
> Billy.
> {code}
> It has a bug in the Parser.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2332) [Hbase Shell] Meta table data selection in Hbase Shell

2007-12-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547933
 ] 

stack commented on HADOOP-2332:
---

Oh, I tried it... and yeah, it fixes not being able to select from catalog 
tables -ROOT- and .META.;

> [Hbase Shell] Meta table data selection in Hbase Shell
> --
>
> Key: HADOOP-2332
> URL: https://issues.apache.org/jira/browse/HADOOP-2332
> Project: Hadoop
>  Issue Type: Bug
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Edward Yoon
>Assignee: Edward Yoon
> Fix For: 0.16.0
>
> Attachments: 2332.patch
>
>
> {code}
> I tried select * from .META.; but no luck from shell.
> Billy.
> {code}
> It has a bug in the Parser.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-1327) Doc on Streaming

2007-12-03 Thread Rob Weltman (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547935
 ] 

Rob Weltman commented on HADOOP-1327:
-

   You have to use Java 1.5 (or turn off validation in forrest.properties). 
This is a known forrest bug.

Rob




> Doc on Streaming
> 
>
> Key: HADOOP-1327
> URL: https://issues.apache.org/jira/browse/HADOOP-1327
> Project: Hadoop
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Runping Qi
> Attachments: HADOOP-1327.patch, site.xml, streaming.html, 
> streaming.xml
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-496) Expose HDFS as a WebDAV store

2007-12-03 Thread Anurag Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547938
 ] 

Anurag Sharma commented on HADOOP-496:
--

hi Owen, ok, will move fuse-j-hadoop to the HADOOP-4 jira.  Thanks for the info.

> Expose HDFS as a WebDAV store
> -
>
> Key: HADOOP-496
> URL: https://issues.apache.org/jira/browse/HADOOP-496
> Project: Hadoop
>  Issue Type: New Feature
>  Components: dfs
>Reporter: Michel Tourn
>Assignee: Enis Soztutar
> Attachments: fuse-j-hadoopfs-0.zip, fuse-j-patch.zip, 
> hadoop-496-3.patch, hadoop-496-4.patch, hadoop-496-spool-cleanup.patch, 
> hadoop-webdav.zip, jetty-slide.xml, lib.webdav.tar.gz, screenshot-1.jpg, 
> slideusers.properties, webdav_wip1.patch, webdav_wip2.patch
>
>
> WebDAV stands for Distributed Authoring and Versioning. It is a set of 
> extensions to the HTTP protocol that lets users collaboratively edit and 
> manage files on a remote web server. It is often considered a replacement 
> for NFS or Samba.
> HDFS (Hadoop Distributed File System) needs a friendly file system interface. 
> DFSShell commands are unfamiliar. Instead it is more convenient for Hadoop 
> users to use a mountable network drive. A friendly interface to HDFS will be 
> used both for casual browsing of data and for bulk import/export. 
> The FUSE provider for HDFS is already available ( 
> http://issues.apache.org/jira/browse/HADOOP-17 )  but it had scalability 
> problems. WebDAV is a popular alternative. 
> The typical licensing terms for WebDAV tools are also attractive: 
> GPL for Linux client tools that Hadoop would not redistribute anyway. 
> More importantly, Apache Project/Apache license for Java tools and for server 
> components. 
> This allows for a tighter integration with the HDFS code base.
> There are some interesting Apache projects that support WebDAV.
> But these are probably too heavyweight for the needs of Hadoop:
> Tomcat servlet: 
> http://tomcat.apache.org/tomcat-4.1-doc/catalina/docs/api/org/apache/catalina/servlets/WebdavServlet.html
> Slide:  http://jakarta.apache.org/slide/
> Being HTTP-based and "backwards-compatible" with Web Browser clients, the 
> WebDAV server protocol could even be piggy-backed on the existing Web UI 
> ports of the Hadoop name node / data nodes. WebDAV can be hosted as (Jetty) 
> servlets. This minimizes server code bloat and this avoids additional network 
> traffic between HDFS and the WebDAV server.
> General Clients (read-only):
> Any web browser
> Linux Clients: 
> Mountable GPL davfs2  http://dav.sourceforge.net/
> FTP-like  GPL Cadaver http://www.webdav.org/cadaver/
> Server Protocol compliance tests:
> http://www.webdav.org/neon/litmus/  
> A goal is for Hadoop HDFS to pass this test (minus support for Properties)
> Pure Java clients:
> DAV Explorer Apache lic. http://www.ics.uci.edu/~webdav/  
> WebDAV also makes it convenient to add advanced features in an incremental 
> fashion:
> file locking, access control lists, hard links, symbolic links.
> New WebDAV standards get accepted and more or less featured WebDAV clients 
> exist.
> core  http://www.webdav.org/specs/rfc2518.html
> ACLs  http://www.webdav.org/specs/rfc3744.html
> redirects "soft links" http://greenbytes.de/tech/webdav/rfc4437.html
> BIND "hard links" http://www.webdav.org/bind/
> quota http://tools.ietf.org/html/rfc4331

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2229) Provide a simple login implementation

2007-12-03 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HADOOP-2229:
--

Status: Open  (was: Patch Available)

This looks pretty good. You need to fix the findbugs warnings, and I'd make all 
of the unix commands more standard (use whoami and groups) and make the 
invoking path strings relative paths. Making them absolute makes it much more 
likely to be wrong on non-Linux platforms.
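
A minimal sketch of the kind of lookup being suggested: shell out to the 
standard commands by relative name rather than an absolute path. This is an 
illustration only, not the code in the attached ugi patches.

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class UnixUserInfoSketch {
  static String run(String command) throws Exception {
    Process p = Runtime.getRuntime().exec(command);
    BufferedReader out = new BufferedReader(new InputStreamReader(p.getInputStream()));
    String line = out.readLine();   // first line of output
    out.close();
    p.waitFor();
    return line;
  }

  public static void main(String[] args) throws Exception {
    System.out.println("user:   " + run("whoami")); // relative name, resolved via PATH
    System.out.println("groups: " + run("groups"));
  }
}
{code}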

> Provide a simple login implementation
> -
>
> Key: HADOOP-2229
> URL: https://issues.apache.org/jira/browse/HADOOP-2229
> Project: Hadoop
>  Issue Type: New Feature
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Hairong Kuang
> Fix For: 0.16.0
>
> Attachments: ugi.patch, ugi1.patch, ugi2.patch, ugi3.patch, 
> ugi4.patch, ugi5.patch
>
>
> Give a simple implementation of HADOOP-1701. Hadoop clients are assumed to 
> be started within a Unix-like network which provides user and group 
> management.  This implementation reads user information from the OS and sends 
> it to the NameNode in plaintext through RPC (see also HADOOP-2184).  The 
> NameNode trusts all information given and uses it for permission checking.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2316) [hbase] Run REST servlet outside of master

2007-12-03 Thread Bryan Duxbury (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Duxbury updated HADOOP-2316:
--

Attachment: rest-external-v2.patch

Improvements as suggested by Stack.

> [hbase] Run REST servlet outside of master
> --
>
> Key: HADOOP-2316
> URL: https://issues.apache.org/jira/browse/HADOOP-2316
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: Bryan Duxbury
>Assignee: Bryan Duxbury
>Priority: Minor
> Attachments: rest-external-v2.patch, rest-external.patch
>
>
> In order to support the desired deployment strategy, we need to be able to run 
> the REST servlet independently of the master info server. We should add a 
> new option to the bin/hbase command ("rest"?) that optionally takes a port 
> and bind address and starts the servlet outside of any other project.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2316) [hbase] Run REST servlet outside of master

2007-12-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547954
 ] 

stack commented on HADOOP-2316:
---

+1

Patch looks good. Tried it.  The little REST server comes up nicely.  I was able to 
do basic GETs.

> [hbase] Run REST servlet outside of master
> --
>
> Key: HADOOP-2316
> URL: https://issues.apache.org/jira/browse/HADOOP-2316
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: Bryan Duxbury
>Assignee: Bryan Duxbury
>Priority: Minor
> Attachments: rest-external-v2.patch, rest-external.patch
>
>
> In order to support the desired deployment strategy, we need to be able to run 
> the REST servlet independently of the master info server. We should add a 
> new option to the bin/hbase command ("rest"?) that optionally takes a port 
> and bind address and starts the servlet outside of any other project.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-4) tool to mount dfs on linux

2007-12-03 Thread Anurag Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-4?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547964
 ] 

Anurag Sharma commented on HADOOP-4:


Hello,
We posted this on HADOOP-496 and were pointed to this jira entry as a better 
place to post this patch.  Pasting our original submission message below...

--
Hi,

We revived the old fuse-hadoop project (a FUSE-J-based plugin that lets you 
mount Hadoop-FS). We have tried this on a small cluster (10 nodes) and basic 
functionality works (mount, ls, cat, cp, mkdir, rm, mv, ...).

The main changes include some bug fixes to FUSE-J and changing the previous 
fuse-hadoop implementation to enforce write-once. We found the FUSE framework 
to be straightforward and simple.

We have seen several mentions of using FUSE with Hadoop, so if there is a 
better place to post these files, please let me know.

Attachments to follow...

-thanks
--

Attachments include the following:
  * fuse-j-hadoop package
  * fuse-j patch.


> tool to mount dfs on linux
> --
>
> Key: HADOOP-4
> URL: https://issues.apache.org/jira/browse/HADOOP-4
> Project: Hadoop
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 0.5.0
> Environment: linux only
>Reporter: John Xing
>Assignee: Doug Cutting
> Attachments: fuse-hadoop-0.1.0_fuse-j.2.2.3_hadoop.0.5.0.tar.gz, 
> fuse-hadoop-0.1.0_fuse-j.2.4_hadoop.0.5.0.tar.gz, fuse-hadoop-0.1.1.tar.gz, 
> fuse-j-hadoopfs-0.1.zip, fuse-j-patch.zip
>
>
> tool to mount dfs on linux

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-4) tool to mount dfs on linux

2007-12-03 Thread Anurag Sharma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-4?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anurag Sharma updated HADOOP-4:
---

Attachment: fuse-j-patch.zip
fuse-j-hadoopfs-0.1.zip

> tool to mount dfs on linux
> --
>
> Key: HADOOP-4
> URL: https://issues.apache.org/jira/browse/HADOOP-4
> Project: Hadoop
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 0.5.0
> Environment: linux only
>Reporter: John Xing
>Assignee: Doug Cutting
> Attachments: fuse-hadoop-0.1.0_fuse-j.2.2.3_hadoop.0.5.0.tar.gz, 
> fuse-hadoop-0.1.0_fuse-j.2.4_hadoop.0.5.0.tar.gz, fuse-hadoop-0.1.1.tar.gz, 
> fuse-j-hadoopfs-0.1.zip, fuse-j-patch.zip
>
>
> tool to mount dfs on linux

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2068) [hbase] RESTful interface

2007-12-03 Thread Bryan Duxbury (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547968
 ] 

Bryan Duxbury commented on HADOOP-2068:
---

Now that I think about it, I don't think that PUT/POST should return a 201, 
because we're not always creating a new resource in the HTTP sense. We should 
just change the spec to say 200.

> [hbase] RESTful interface
> -
>
> Key: HADOOP-2068
> URL: https://issues.apache.org/jira/browse/HADOOP-2068
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: stack
>Assignee: Bryan Duxbury
>Priority: Minor
> Fix For: 0.16.0
>
> Attachments: rest-11-27-07-v2.patch, rest-11-27-07.3.patc, 
> rest-11-27-07.patch, rest-11-28-07.2.patch, rest-11-28-07.3.patch, 
> rest-11-28-07.patch, rest.patch
>
>
> A RESTful interface would be one means of making hbase accessible to clients 
> that are not java.  It might look something like the below:
> + An HTTP GET of  http://MASTER:PORT/ outputs the master's attributes: online 
> meta regions, list of tables, etc.: i.e. what you see now when you go to 
> http://MASTER:PORT/master.jsp.
> + An HTTP GET of http://MASTER:PORT/TABLENAME: 200 if the table exists and 
> HTableDescription (mimetype: text/plain or text/xml) or 401 if no such table. 
>  HTTP DELETE would drop the table.  HTTP PUT would add one.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW: 200 if row exists and 401 
> if not.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNFAMILY: 
> HColumnDescriptor (mimetype: text/plain or text/xml) or 401 if no such table.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME/: 200 and latest 
> version (mimetype: binary/octet-stream) or 401 if no such cell. HTTP DELETE 
> would delete the cell.  HTTP PUT would add a new version.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME/TIMESTAMP: 200 
> (mimetype: binary/octet-stream) or 401 if no such cell. HTTP DELETE would 
> remove.  HTTP PUT would put this record.
> + Browser originally goes against master but master then redirects to the 
> hosting region server to serve, update, delete, etc. the addressed cell

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2068) [hbase] RESTful interface

2007-12-03 Thread Billy Pearson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547969
 ] 

Billy Pearson commented on HADOOP-2068:
---

After a lot of work trying to get PHP curl to work with the POST option, it 
looks like we will need support for the content type "x-www-form-urlencoded". I 
have tried many ways to get curl to encode the data and send it as text/xml, but 
using the post-fields option in PHP curl sends the data with the content type 
application/x-www-form-urlencoded. Since that is not a supported content type, I 
get back an HTTP error:
406 Unsupported Accept Header Content: application/x-www-form-urlencoded

So is there a way we can add support for that?

If so, we could still use the XML format as the data, but make sure to urldecode 
it first.
Also, can we have a set field to post the XML to that the app knows about, like 
"xmldata" or something like that, for POST?

So we can set the post field:

Example 
xmldata='   a:
YQ==   ';

Then the app knows where to look for the xml data.
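
For comparison, here is a rough Java sketch of sending the XML body with an explicit 
text/xml content type, which is what the servlet currently accepts. The endpoint, 
table/row names, and XML element names below are made up for illustration:

{code:java}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Illustrative only: PUT an XML payload with Content-Type text/xml.
public class PutXmlExample {
  public static void main(String[] args) throws Exception {
    // Hypothetical payload and endpoint; element names and URL are assumptions.
    String xml = "<column><name>a:</name><value>YQ==</value></column>";
    URL url = new URL("http://master:60010/api/webdata/row/10/a:");

    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setDoOutput(true);
    conn.setRequestProperty("Content-Type", "text/xml");

    OutputStream out = conn.getOutputStream();
    out.write(xml.getBytes("UTF-8"));
    out.close();

    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}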


> [hbase] RESTful interface
> -
>
> Key: HADOOP-2068
> URL: https://issues.apache.org/jira/browse/HADOOP-2068
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: stack
>Assignee: Bryan Duxbury
>Priority: Minor
> Fix For: 0.16.0
>
> Attachments: rest-11-27-07-v2.patch, rest-11-27-07.3.patc, 
> rest-11-27-07.patch, rest-11-28-07.2.patch, rest-11-28-07.3.patch, 
> rest-11-28-07.patch, rest.patch
>
>
> A RESTful interface would be one means of making hbase accessible to clients 
> that are not java.  It might look something like the below:
> + An HTTP GET of  http://MASTER:PORT/ outputs the master's attributes: online 
> meta regions, list of tables, etc.: i.e. what you see now when you go to 
> http://MASTER:PORT/master.jsp.
> + An HTTP GET of http://MASTER:PORT/TABLENAME: 200 if the table exists and 
> HTableDescription (mimetype: text/plain or text/xml) or 401 if no such table. 
>  HTTP DELETE would drop the table.  HTTP PUT would add one.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW: 200 if row exists and 401 
> if not.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNFAMILY: 
> HColumnDescriptor (mimetype: text/plain or text/xml) or 401 if no such table.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME/: 200 and latest 
> version (mimetype: binary/octet-stream) or 401 if no such cell. HTTP DELETE 
> would delete the cell.  HTTP PUT would add a new version.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME/TIMESTAMP: 200 
> (mimetype: binary/octet-stream) or 401 if no such cell. HTTP DELETE would 
> remove.  HTTP PUT would put this record.
> + Browser originally goes against master but master then redirects to the 
> hosting region server to serve, update, delete, etc. the addressed cell

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2068) [hbase] RESTful interface

2007-12-03 Thread Bryan Duxbury (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547978
 ] 

Bryan Duxbury commented on HADOOP-2068:
---

I am unfamiliar with php's HTTP library. However, I don't think we  
should change the content-type accepted for XML formatted data. An  
XML entity body is definitely NOT x-www-form-urlencoded. Hacking it  
to work around that essentially breaks the HTTP spec.

I would suggest finding out if you can get lower-level access to the  
HTTP session than what php's library is giving you. This is not  
incredibly complicated functionality.

As a last resort, I invite you to submit a patch that can read the  
xml out of the postdata when encoded as x-www-form-urlencoded and  
we'll find a way to work it in.





> [hbase] RESTful interface
> -
>
> Key: HADOOP-2068
> URL: https://issues.apache.org/jira/browse/HADOOP-2068
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: stack
>Assignee: Bryan Duxbury
>Priority: Minor
> Fix For: 0.16.0
>
> Attachments: rest-11-27-07-v2.patch, rest-11-27-07.3.patc, 
> rest-11-27-07.patch, rest-11-28-07.2.patch, rest-11-28-07.3.patch, 
> rest-11-28-07.patch, rest.patch
>
>
> A RESTful interface would be one means of making hbase accessible to clients 
> that are not java.  It might look something like the below:
> + An HTTP GET of  http://MASTER:PORT/ outputs the master's attributes: online 
> meta regions, list of tables, etc.: i.e. what you see now when you go to 
> http://MASTER:PORT/master.jsp.
> + An HTTP GET of http://MASTER:PORT/TABLENAME: 200 if the table exists and 
> HTableDescription (mimetype: text/plain or text/xml) or 401 if no such table. 
>  HTTP DELETE would drop the table.  HTTP PUT would add one.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW: 200 if row exists and 401 
> if not.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNFAMILY: 
> HColumnDescriptor (mimetype: text/plain or text/xml) or 401 if no such table.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME/: 200 and latest 
> version (mimetype: binary/octet-stream) or 401 if no such cell. HTTP DELETE 
> would delete the cell.  HTTP PUT would add a new version.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME/TIMESTAMP: 200 
> (mimetype: binary/octet-stream) or 401 if no such cell. HTTP DELETE would 
> remove.  HTTP PUT would put this record.
> + Browser originally goes against master but master then redirects to the 
> hosting region server to serve, update, delete, etc. the addressed cell

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2068) [hbase] RESTful interface

2007-12-03 Thread Billy Pearson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547972
 ] 

Billy Pearson commented on HADOOP-2068:
---

I got the PUT option working; it is now returning 200 and the data is in the 
table, so that's good.

But to use the PUT option you have to save the data to a file before putting 
the data via PHP curl.
That's an extra step when inserting data into the tables that I would like to 
skip by using the POST option.


> [hbase] RESTful interface
> -
>
> Key: HADOOP-2068
> URL: https://issues.apache.org/jira/browse/HADOOP-2068
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: stack
>Assignee: Bryan Duxbury
>Priority: Minor
> Fix For: 0.16.0
>
> Attachments: rest-11-27-07-v2.patch, rest-11-27-07.3.patc, 
> rest-11-27-07.patch, rest-11-28-07.2.patch, rest-11-28-07.3.patch, 
> rest-11-28-07.patch, rest.patch
>
>
> A RESTful interface would be one means of making hbase accessible to clients 
> that are not java.  It might look something like the below:
> + An HTTP GET of  http://MASTER:PORT/ outputs the master's attributes: online 
> meta regions, list of tables, etc.: i.e. what you see now when you go to 
> http://MASTER:PORT/master.jsp.
> + An HTTP GET of http://MASTER:PORT/TABLENAME: 200 if the table exists and 
> HTableDescription (mimetype: text/plain or text/xml) or 401 if no such table. 
>  HTTP DELETE would drop the table.  HTTP PUT would add one.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW: 200 if row exists and 401 
> if not.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNFAMILY: 
> HColumnDescriptor (mimetype: text/plain or text/xml) or 401 if no such table.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME/: 200 and latest 
> version (mimetype: binary/octet-stream) or 401 if no such cell. HTTP DELETE 
> would delete the cell.  HTTP PUT would add a new version.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME/TIMESTAMP: 200 
> (mimetype: binary/octet-stream) or 401 if no such cell. HTTP DELETE would 
> remove.  HTTP PUT would put this record.
> + Browser originally goes against master but master then redirects to the 
> hosting region server to serve, update, delete, etc. the addressed cell

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2316) [hbase] Run REST servlet outside of master

2007-12-03 Thread Bryan Duxbury (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Duxbury updated HADOOP-2316:
--

Status: Patch Available  (was: Open)

Submitting patch

> [hbase] Run REST servlet outside of master
> --
>
> Key: HADOOP-2316
> URL: https://issues.apache.org/jira/browse/HADOOP-2316
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: Bryan Duxbury
>Assignee: Bryan Duxbury
>Priority: Minor
> Attachments: rest-external-v2.patch, rest-external.patch
>
>
> In order to support the desired deployment strategy, we need to be able to run 
> the REST servlet independently of the master info server. We should add a 
> new option to the bin/hbase command ("rest"?) that optionally takes a port 
> and bind address and starts the servlet outside of any other project.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2068) [hbase] RESTful interface

2007-12-03 Thread Billy Pearson (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547984
 ] 

Billy Pearson commented on HADOOP-2068:
---

Not sure how this happens, but here is the screen output.
I think this is the reason I was getting a 200 HTTP code but could not see the 
results in the shell.
I ran this after restarting all of Hadoop and HBase, so nothing should be cached 
in any sense.

{code:xml} 
[EMAIL PROTECTED] bin]# pwd
/hadoop/src/contrib/hbase/bin
[EMAIL PROTECTED] bin]# curl -v http://192.168.1.200:60010/api/webdata/row/10/
* About to connect() to 192.168.1.200 port 60010
*   Trying 192.168.1.200... * connected
* Connected to 192.168.1.200 (192.168.1.200) port 60010
> GET /api/webdata/row/10/ HTTP/1.1
User-Agent: curl/7.12.1 (i686-redhat-linux-gnu) libcurl/7.12.1 OpenSSL/0.9.7a 
zlib/1.2.1.2 libidn/0.5.6
Host: 192.168.1.200:60010
Pragma: no-cache
Accept: */*

< HTTP/1.1 200 OK
< Date: Mon, 03 Dec 2007 20:51:30 GMT
< Server: Jetty/5.1.4 (Linux/2.6.9-55.0.12.ELsmp i386 java/1.5.0_12
< Content-Type: text/xml;charset=UTF-8
< Transfer-Encoding: chunked


 
  
stime:
  
  
NDU2
  
 
 
  
stime:now
  
  
Nzg5
  
 
* Connection #0 to host 192.168.1.200 left intact
* Closing connection #0
[EMAIL PROTECTED] bin]# ./hbase shell
Hbase Shell, 0.0.2 version.
Copyright (c) 2007 by udanax, licensed to Apache Software Foundation.
Type 'help;' for usage.

hql > select * from webdata;
+-+-+-+
| Row | Column  | Cell|
+-+-+-+
0 row(s) in set (0.58 sec)
hql > exit;
[EMAIL PROTECTED] bin]#
{code} 


> [hbase] RESTful interface
> -
>
> Key: HADOOP-2068
> URL: https://issues.apache.org/jira/browse/HADOOP-2068
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: stack
>Assignee: Bryan Duxbury
>Priority: Minor
> Fix For: 0.16.0
>
> Attachments: rest-11-27-07-v2.patch, rest-11-27-07.3.patc, 
> rest-11-27-07.patch, rest-11-28-07.2.patch, rest-11-28-07.3.patch, 
> rest-11-28-07.patch, rest.patch
>
>
> A RESTful interface would be one means of making hbase accessible to clients 
> that are not java.  It might look something like the below:
> + An HTTP GET of  http://MASTER:PORT/ outputs the master's attributes: online 
> meta regions, list of tables, etc.: i.e. what you see now when you go to 
> http://MASTER:PORT/master.jsp.
> + An HTTP GET of http://MASTER:PORT/TABLENAME: 200 if the table exists and 
> HTableDescription (mimetype: text/plain or text/xml) or 401 if no such table. 
>  HTTP DELETE would drop the table.  HTTP PUT would add one.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW: 200 if row exists and 401 
> if not.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNFAMILY: 
> HColumnDescriptor (mimetype: text/plain or text/xml) or 401 if no such table.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME/: 200 and latest 
> version (mimetype: binary/octet-stream) or 401 if no such cell. HTTP DELETE 
> would delete the cell.  HTTP PUT would add a new version.
> + An HTTP GET of http://MASTER:PORT/TABLENAME/ROW/COLUMNNAME/TIMESTAMP: 200 
> (mimetype: binary/octet-stream) or 401 if no such cell. HTTP DELETE would 
> remove.  HTTP PUT would put this record.
> + Browser originally goes against master but master then redirects to the 
> hosting region server to serve, update, delete, etc. the addressed cell

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2229) Provide a simple login implementation

2007-12-03 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HADOOP-2229:
--

Attachment: ugi6.patch

This patch fixes the findbugs errors and incorporates Owen's comments. Also, 
instead of using the command "id", it uses "whoami" and "groups" to get the 
current user's name and groups list.

> Provide a simple login implementation
> -
>
> Key: HADOOP-2229
> URL: https://issues.apache.org/jira/browse/HADOOP-2229
> Project: Hadoop
>  Issue Type: New Feature
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Hairong Kuang
> Fix For: 0.16.0
>
> Attachments: ugi.patch, ugi1.patch, ugi2.patch, ugi3.patch, 
> ugi4.patch, ugi5.patch, ugi6.patch
>
>
> Give a simple implementation of HADOOP-1701. Hadoop clients are assumed to 
> be started within a Unix-like network which provides user and group 
> management.  This implementation reads user information from the OS and sends 
> it to the NameNode in plaintext through RPC (see also HADOOP-2184).  The 
> NameNode trusts all information given and uses it for permission checking.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2229) Provide a simple login implementation

2007-12-03 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HADOOP-2229:
--

Status: Patch Available  (was: Open)

> Provide a simple login implementation
> -
>
> Key: HADOOP-2229
> URL: https://issues.apache.org/jira/browse/HADOOP-2229
> Project: Hadoop
>  Issue Type: New Feature
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Hairong Kuang
> Fix For: 0.16.0
>
> Attachments: ugi.patch, ugi1.patch, ugi2.patch, ugi3.patch, 
> ugi4.patch, ugi5.patch, ugi6.patch
>
>
> Give a simple implementation of HADOOP-1701. Hadoop clients are assumed to 
> be started within a Unix-like network which provides user and group 
> management.  This implementation reads user information from the OS and sends 
> it to the NameNode in plaintext through RPC (see also HADOOP-2184).  The 
> NameNode trusts all information given and uses it for permission checking.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2268) JobControl classes should use interfaces rather than implemenations

2007-12-03 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547993
 ] 

Doug Cutting commented on HADOOP-2268:
--

> Another option would be to leave those 3 methods returning ArrayList [...]

I'm okay with that.

> JobControl classes should use interfaces rather than implemenations
> ---
>
> Key: HADOOP-2268
> URL: https://issues.apache.org/jira/browse/HADOOP-2268
> Project: Hadoop
>  Issue Type: Improvement
>  Components: mapred
>Affects Versions: 0.15.0
>Reporter: Adrian Woodhead
>Assignee: Adrian Woodhead
>Priority: Minor
> Fix For: 0.16.0
>
> Attachments: HADOOP-2268-1.patch, HADOOP-2268-2.patch
>
>
> See HADOOP-2202 for background on this issue. Arun C. Murthy agrees that when 
> possible it is preferable to program against the interface rather than a 
> concrete implementation (more flexible, allows for changes of the 
> implementation in future etc.) JobControl currently exposes running, waiting, 
> ready, successful and dependent jobs as ArrayList rather than List. I propose 
> to change this to List.
> I will code up a patch for this.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2288) Change FileSystem API to support access control.

2007-12-03 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HADOOP-2288:
---

Attachment: 2288_20071203.patch

use "ls -ld" instead of "stat"

> Change FileSystem API to support access control.
> 
>
> Key: HADOOP-2288
> URL: https://issues.apache.org/jira/browse/HADOOP-2288
> Project: Hadoop
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.16.0
>
> Attachments: 2288_20071203.patch
>
>
> - Some FileSystem methods like create and mkdir need an additional parameter 
> for permission.
> - FileSystem has to provide methods for setting permission, changing 
> ownership, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2288) Change FileSystem API to support access control.

2007-12-03 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HADOOP-2288:
---

Status: Open  (was: Patch Available)

> Change FileSystem API to support access control.
> 
>
> Key: HADOOP-2288
> URL: https://issues.apache.org/jira/browse/HADOOP-2288
> Project: Hadoop
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.16.0
>
> Attachments: 2288_20071203.patch
>
>
> - Some FileSystem methods like create and mkdir need an additional parameter 
> for permission.
> - FileSystem has to provide methods for setting permission, changing 
> ownership, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2288) Change FileSystem API to support access control.

2007-12-03 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HADOOP-2288:
---

Status: Patch Available  (was: Open)

> Change FileSystem API to support access control.
> 
>
> Key: HADOOP-2288
> URL: https://issues.apache.org/jira/browse/HADOOP-2288
> Project: Hadoop
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.16.0
>
> Attachments: 2288_20071203.patch
>
>
> - Some FileSystem methods like create and mkdir need an additional parameter 
> for permission.
> - FileSystem has to provide methods for setting permission, changing 
> ownership, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2327) Streaming: need to be able to re-run specific map tasks (when -reducer NONE)

2007-12-03 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548003
 ] 

Doug Cutting commented on HADOOP-2327:
--

Wouldn't it be better to invest in a more configurable and extensible retry 
mechanism for maps?  If a task fails, we should have hooks that permit cleanup 
of any side-effect data before retry and/or to move final results into place on 
success.  If we had to choose, I'd rather have that than the ability to re-run 
particular tasks by hand.

Another approach to this might be to bypass tasks whose output already exists.  
Then one can simply re-run the original job and only those tasks whose output 
does not exist would require execution.  For example, the InputFormat could 
check which outputs already exist and not generate input splits for those.  
Could that work?
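
A rough sketch of that second idea, using the classic org.apache.hadoop.mapred API 
(signatures may differ slightly across versions). The output-file naming convention 
and the split-to-task mapping below are assumptions for illustration, not an 
existing Hadoop feature:

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;

// Illustrative only: drop input splits whose map output already exists, so
// re-running the job re-executes only the maps that have not produced output.
public class SkipCompletedSplitsInputFormat extends TextInputFormat {

  @Override
  public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException {
    InputSplit[] all = super.getSplits(job, numSplits);
    FileSystem fs = FileSystem.get(job);
    Path outDir = new Path(job.get("mapred.output.dir"));

    List<InputSplit> pending = new ArrayList<InputSplit>();
    for (int i = 0; i < all.length; i++) {
      // Assumed convention: map i writes outDir/part-0000i when -reducer NONE.
      Path expected = new Path(outDir, String.format("part-%05d", i));
      if (!fs.exists(expected)) {
        pending.add(all[i]);
      }
    }
    return pending.toArray(new InputSplit[pending.size()]);
  }
}
{code}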

> Streaming: need to be able to re-run specific map tasks (when -reducer NONE)
> 
>
> Key: HADOOP-2327
> URL: https://issues.apache.org/jira/browse/HADOOP-2327
> Project: Hadoop
>  Issue Type: Improvement
>  Components: contrib/streaming
>Reporter: arkady borkovsky
>
> Sometimes, a few map tasks fail when running with -reducer NONE.  
> It should be possible to rerun the failed map tasks.
> There are several failure modes:
>* a task is hanging, so the job is killed
>* from the infrastructure perspective, the task has completed successfully, 
> but it failed to produce a correct result
>* it failed in the proper Hadoop sense
> It is often too expensive to rerun the whole job.  And for larger jobs, 
> chances are each run will have a few failed tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-1841) IPC server should write repsonses asynchronously

2007-12-03 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548004
 ] 

Doug Cutting commented on HADOOP-1841:
--

> Please let me know is it addresses your concerns.

The unit test sounds good.  Do we have any particular field uses where we think 
this will improve performance?

> IPC server should write repsonses asynchronously
> 
>
> Key: HADOOP-1841
> URL: https://issues.apache.org/jira/browse/HADOOP-1841
> Project: Hadoop
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Doug Cutting
>Assignee: dhruba borthakur
> Attachments: asyncRPC-2.patch, asyncRPC-4.patch, asyncRPC.patch, 
> asyncRPC.patch
>
>
> Hadoop's IPC Server currently writes responses from request handler threads 
> using blocking writes.  Performance and scalability might be improved if 
> responses were written asynchronously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2231) ShellCommand, in particular 'df -k', sometimes hang

2007-12-03 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated HADOOP-2231:
---

Attachment: HADOOP-2231.patch

> ShellCommand, in particular 'df -k', sometimes hang
> ---
>
> Key: HADOOP-2231
> URL: https://issues.apache.org/jira/browse/HADOOP-2231
> Project: Hadoop
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 0.15.1
>Reporter: Christian Kunz
> Attachments: HADOOP-2231.patch
>
>
> We noticed that some pipes applications writing to dfs using libhdfs have 
> about a 6% chance of hanging when executing 'df -k' to find out whether there 
> is enough space available on the local filesystem before opening a file for 
> write.
> Why not use File.getFreeSpace() or File.getUsableSpace()?
> The call stack is:
> Exception in thread "main" java.io.IOException
>  at org.apache.hadoop.fs.ShellCommand.runCommand
> (ShellCommand.java:52)
>  at org.apache.hadoop.fs.ShellCommand.run(ShellCommand.java:42)
>  at org.apache.hadoop.fs.DF.getAvailable(DF.java:72)
>  at org.apache.hadoop.fs.LocalDirAllocator
> $AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:264)
>  at org.apache.hadoop.fs.LocalDirAllocator
> $AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:294)
>  at
> org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite
> (LocalDirAllocator.java:155)
>  at org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream.newBackupFile(DFSClient.java:1470)
>  at org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream.openBackupStream(DFSClient.java:1437)
>  at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk
> (DFSClient.java:1579)
>  at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk
> (FSOutputSummer.java:140)
>  at org.apache.hadoop.fs.FSOutputSummer.write1
> (FSOutputSummer.java:100)
>  at org.apache.hadoop.fs.FSOutputSummer.write
> (FSOutputSummer.java:86)
>  at org.apache.hadoop.fs.FSDataOutputStream
> $PositionCache.write(FSDataOutputStream.java:39)
>  at java.io.DataOutputStream.write(DataOutputStream.java:90)
>  at java.io.FilterOutputStream.write(FilterOutputStream.java:80)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2316) [hbase] Run REST servlet outside of master

2007-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548009
 ] 

Hadoop QA commented on HADOOP-2316:
---

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12370864/rest-external-v2.patch
against trunk revision r600627.

@author +1.  The patch does not contain any @author tags.

javadoc +1.  The javadoc tool did not generate any warning messages.

javac +1.  The applied patch does not generate any new compiler warnings.

findbugs +1.  The patch does not introduce any new Findbugs warnings.

core tests +1.  The patch passed core unit tests.

contrib tests -1.  The patch failed contrib unit tests.

Test results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1246/testReport/
Findbugs warnings: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1246/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1246/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1246/console

This message is automatically generated.

> [hbase] Run REST servlet outside of master
> --
>
> Key: HADOOP-2316
> URL: https://issues.apache.org/jira/browse/HADOOP-2316
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: Bryan Duxbury
>Assignee: Bryan Duxbury
>Priority: Minor
> Attachments: rest-external-v2.patch, rest-external.patch
>
>
> In order to support the desired deployment strategy, we need to be able to run 
> the REST servlet independently of the master info server. We should add a 
> new option to the bin/hbase command ("rest"?) that optionally takes a port 
> and bind address and starts the servlet outside of any other project.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2231) ShellCommand, in particular 'df -k', sometimes hang

2007-12-03 Thread Amar Kamat (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548010
 ] 

Amar Kamat commented on HADOOP-2231:


There are several cases where {{df -k}} can hang, for example:
1. If it encounters a faulty NFS mount. In this case {{df -k}} hangs forever 
waiting for the NFS server.
2. If it encounters a faulty file system/block.
3. Due to overflow of the inputStream/errorStream buffers, leading to some kind 
of deadlock.
Assuming the cause is (3), I am uploading a patch. A similar bug was found in 
{{TaskRunner.java}}; that is fixed in this patch as well.
Comments?
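
For reference, a small standalone sketch of the usual fix, draining stderr on its 
own thread so a full pipe buffer cannot block the child process (the command and 
buffer size here are arbitrary):

{code:java}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Illustrative only: consume stderr on a separate thread while reading stdout,
// so neither pipe buffer fills up and deadlocks the "df -k" child process.
public class RunDf {
  public static void main(String[] args) throws Exception {
    final Process p = Runtime.getRuntime().exec(new String[] {"df", "-k", "."});

    Thread errDrainer = new Thread() {
      public void run() {
        try {
          InputStream err = p.getErrorStream();
          byte[] buf = new byte[4096];
          while (err.read(buf) >= 0) {
            // Discard; the point is only to keep the buffer from filling up.
          }
        } catch (IOException ignored) {
          // Nothing useful to do here in a sketch.
        }
      }
    };
    errDrainer.start();

    BufferedReader out = new BufferedReader(new InputStreamReader(p.getInputStream()));
    String line;
    while ((line = out.readLine()) != null) {
      System.out.println(line);
    }

    int exit = p.waitFor();
    errDrainer.join();
    System.out.println("exit code: " + exit);
  }
}
{code}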

> ShellCommand, in particular 'df -k', sometimes hang
> ---
>
> Key: HADOOP-2231
> URL: https://issues.apache.org/jira/browse/HADOOP-2231
> Project: Hadoop
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 0.15.1
>Reporter: Christian Kunz
> Fix For: 0.15.2
>
> Attachments: HADOOP-2231.patch
>
>
> We noticed that some pipes applications writing to dfs using libhdfs have 
> about a 6% chance of hanging when executing 'df -k' to find out whether there 
> is enough space available on the local filesystem before opening a file for 
> write.
> Why not use File.getFreeSpace() or File.getUsableSpace()?
> The call stack is:
> Exception in thread "main" java.io.IOException
>  at org.apache.hadoop.fs.ShellCommand.runCommand
> (ShellCommand.java:52)
>  at org.apache.hadoop.fs.ShellCommand.run(ShellCommand.java:42)
>  at org.apache.hadoop.fs.DF.getAvailable(DF.java:72)
>  at org.apache.hadoop.fs.LocalDirAllocator
> $AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:264)
>  at org.apache.hadoop.fs.LocalDirAllocator
> $AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:294)
>  at
> org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite
> (LocalDirAllocator.java:155)
>  at org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream.newBackupFile(DFSClient.java:1470)
>  at org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream.openBackupStream(DFSClient.java:1437)
>  at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk
> (DFSClient.java:1579)
>  at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk
> (FSOutputSummer.java:140)
>  at org.apache.hadoop.fs.FSOutputSummer.write1
> (FSOutputSummer.java:100)
>  at org.apache.hadoop.fs.FSOutputSummer.write
> (FSOutputSummer.java:86)
>  at org.apache.hadoop.fs.FSDataOutputStream
> $PositionCache.write(FSDataOutputStream.java:39)
>  at java.io.DataOutputStream.write(DataOutputStream.java:90)
>  at java.io.FilterOutputStream.write(FilterOutputStream.java:80)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2299) [hbase] Support inclusive scans

2007-12-03 Thread Bryan Duxbury (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Duxbury updated HADOOP-2299:
--

Attachment: inclusive-stop-row.patch

Adds implementation of InclusiveStopRowFilter and a matching unit test.
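
Not the attached patch, but a tiny standalone illustration of the inclusive-stop 
semantics (a real HBase filter would implement the project's filter interface; the 
class and method names here are made up):

{code:java}
import org.apache.hadoop.io.Text;

// Illustrative only: filter out rows strictly *after* the stop row, so the
// stop row itself is still returned (an exclusive filter would use >= 0).
public class InclusiveStopRowCheck {
  private final Text stopRow;

  public InclusiveStopRowCheck(Text stopRow) {
    this.stopRow = stopRow;
  }

  /** @return true if the row is past the stop row and should be filtered out. */
  public boolean filterOut(Text rowKey) {
    return rowKey.compareTo(stopRow) > 0;
  }

  public static void main(String[] args) {
    InclusiveStopRowCheck check = new InclusiveStopRowCheck(new Text("row-100"));
    System.out.println(check.filterOut(new Text("row-099"))); // false: before the stop row
    System.out.println(check.filterOut(new Text("row-100"))); // false: the stop row is kept
    System.out.println(check.filterOut(new Text("row-101"))); // true:  past the stop row
  }
}
{code}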

> [hbase] Support inclusive scans
> ---
>
> Key: HADOOP-2299
> URL: https://issues.apache.org/jira/browse/HADOOP-2299
> Project: Hadoop
>  Issue Type: Improvement
>  Components: contrib/hbase
>Reporter: Bryan Duxbury
>Priority: Trivial
> Attachments: inclusive-stop-row.patch
>
>
> The existing scanner interface excludes the end key from the result range. If 
> you actually want to do an inclusive scan for some reason, you would 
> currently have to guess at the key immediately after the end key, which is a 
> shoddy solution. 
> A new stoprow filter could be created that stops at the end key but also 
> returns it. Then, you could supply an extra parameter to getScanner to say 
> you want an inclusive scan.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2231) ShellCommand, in particular 'df -k', sometimes hang

2007-12-03 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated HADOOP-2231:
---

Fix Version/s: 0.15.2
   Status: Patch Available  (was: Open)

> ShellCommand, in particular 'df -k', sometimes hang
> ---
>
> Key: HADOOP-2231
> URL: https://issues.apache.org/jira/browse/HADOOP-2231
> Project: Hadoop
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 0.15.1
>Reporter: Christian Kunz
> Fix For: 0.15.2
>
> Attachments: HADOOP-2231.patch
>
>
> We noticed that some pipes applications writing to dfs using libhdfs have 
> about a 6% chance of hanging when executing 'df -k' to find out whether there 
> is enough space available on the local filesystem before opening a file for 
> write.
> Why not use File.getFreeSpace() or File.getUsableSpace()?
> The call stack is:
> Exception in thread "main" java.io.IOException
>  at org.apache.hadoop.fs.ShellCommand.runCommand
> (ShellCommand.java:52)
>  at org.apache.hadoop.fs.ShellCommand.run(ShellCommand.java:42)
>  at org.apache.hadoop.fs.DF.getAvailable(DF.java:72)
>  at org.apache.hadoop.fs.LocalDirAllocator
> $AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:264)
>  at org.apache.hadoop.fs.LocalDirAllocator
> $AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:294)
>  at
> org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite
> (LocalDirAllocator.java:155)
>  at org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream.newBackupFile(DFSClient.java:1470)
>  at org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream.openBackupStream(DFSClient.java:1437)
>  at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk
> (DFSClient.java:1579)
>  at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk
> (FSOutputSummer.java:140)
>  at org.apache.hadoop.fs.FSOutputSummer.write1
> (FSOutputSummer.java:100)
>  at org.apache.hadoop.fs.FSOutputSummer.write
> (FSOutputSummer.java:86)
>  at org.apache.hadoop.fs.FSDataOutputStream
> $PositionCache.write(FSDataOutputStream.java:39)
>  at java.io.DataOutputStream.write(DataOutputStream.java:90)
>  at java.io.FilterOutputStream.write(FilterOutputStream.java:80)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2329) [Hbase Shell] Addition of Built-In Value Data Types for efficient accessing and stroing data

2007-12-03 Thread Jim Kellerman (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548012
 ] 

Jim Kellerman commented on HADOOP-2329:
---

-1

Adding value types to the HBase server side is not a good idea. Since members 
of a column family can be created on an ad-hoc basis, there would be a lot of 
bookkeeping to do to determine if the family member should be of a particular 
type. And if there were no data about a particular family member, what type 
should it be? 

It would be unacceptable to force all members of a column family to be the same 
type.

Additionally, there have been requests to loosen the restriction on the row key 
being a Text and instead accept any WritableComparable as the row key.

HADOOP-2197 would permit applications to tag columns with arbitrary key/value 
pairs. Thus an application could store family member/type information using 
this mechanism.

Bigtable is typeless and I think HBase should be as well.

> [Hbase Shell] Addition of Built-In Value Data Types for efficient accessing 
> and stroing data
> 
>
> Key: HADOOP-2329
> URL: https://issues.apache.org/jira/browse/HADOOP-2329
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Edward Yoon
>Assignee: Edward Yoon
> Fix For: 0.16.0
>
>
> A built-in data type is a fundamental data type that the hbase shell defines.
> (character strings, scalars, ranges, arrays, ... , etc)
> If you need a specialized data type that is not currently provided as a 
> built-in type, 
> you are encouraged to write your own user-defined data type using UDC(not yet 
> implemented).
> (or contribute it for distribution in a future release of hbase shell)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



limiting memory use on namenode?

2007-12-03 Thread Doug Cutting
Over on pig-dev, Sam Pullara pointed out the following Java mechanism 
that permits one to be notified when memory usage exceeds a threshold.


http://java.sun.com/javase/6/docs/api/java/lang/management/MemoryPoolMXBean.html#UsageThreshold

Perhaps we could use something like this on the namenode?  For example, 
when memory usage is too high, the namenode could refuse creation of new 
files.  This would still crash applications, but it would keep the 
filesystem itself from crashing in a way that is hard to recover while 
folks remove excessive (presumably) small files.


Doug
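
A small self-contained sketch of that mechanism, using only the java.lang.management 
and javax.management APIs; the 90% threshold and the reaction (just a log line here) 
are placeholders, not a proposal for the actual namenode behaviour:

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

// Illustrative only: ask the JVM to notify us when any heap pool crosses a
// usage threshold. A namenode could set a flag here and refuse new creates.
public class HeapThresholdWatcher {
  public static void main(String[] args) {
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      if (pool.getType() == MemoryType.HEAP && pool.isUsageThresholdSupported()) {
        long max = pool.getUsage().getMax();
        if (max > 0) {
          pool.setUsageThreshold((long) (max * 0.9)); // 90% is an arbitrary choice
        }
      }
    }

    NotificationEmitter emitter =
        (NotificationEmitter) ManagementFactory.getMemoryMXBean();
    emitter.addNotificationListener(new NotificationListener() {
      public void handleNotification(Notification n, Object handback) {
        if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(n.getType())) {
          System.err.println("Heap usage threshold exceeded: " + n.getMessage());
          // Here the namenode could start rejecting new file creations.
        }
      }
    }, null, null);
  }
}
{code}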


[jira] Commented: (HADOOP-2231) ShellCommand, in particular 'df -k', sometimes hang

2007-12-03 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548013
 ] 

Owen O'Malley commented on HADOOP-2231:
---

Since we are going to Java 1.6 (ala HADOOP-2325), we can and should move to use 
java.io.File.getFreeSpace().

> ShellCommand, in particular 'df -k', sometimes hang
> ---
>
> Key: HADOOP-2231
> URL: https://issues.apache.org/jira/browse/HADOOP-2231
> Project: Hadoop
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 0.15.1
>Reporter: Christian Kunz
> Fix For: 0.15.2
>
> Attachments: HADOOP-2231.patch
>
>
> We noticed that some pipes applications writing to dfs using libhdfs have 
> about a 6% chance of hanging when executing 'df -k' to find out whether there 
> is enough space available on the local filesystem before opening a file for 
> write.
> Why not use File.getFreeSpace() or File.getUsableSpace()?
> The call stack is:
> Exception in thread "main" java.io.IOException
>  at org.apache.hadoop.fs.ShellCommand.runCommand
> (ShellCommand.java:52)
>  at org.apache.hadoop.fs.ShellCommand.run(ShellCommand.java:42)
>  at org.apache.hadoop.fs.DF.getAvailable(DF.java:72)
>  at org.apache.hadoop.fs.LocalDirAllocator
> $AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:264)
>  at org.apache.hadoop.fs.LocalDirAllocator
> $AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:294)
>  at
> org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite
> (LocalDirAllocator.java:155)
>  at org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream.newBackupFile(DFSClient.java:1470)
>  at org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream.openBackupStream(DFSClient.java:1437)
>  at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk
> (DFSClient.java:1579)
>  at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk
> (FSOutputSummer.java:140)
>  at org.apache.hadoop.fs.FSOutputSummer.write1
> (FSOutputSummer.java:100)
>  at org.apache.hadoop.fs.FSOutputSummer.write
> (FSOutputSummer.java:86)
>  at org.apache.hadoop.fs.FSDataOutputStream
> $PositionCache.write(FSDataOutputStream.java:39)
>  at java.io.DataOutputStream.write(DataOutputStream.java:90)
>  at java.io.FilterOutputStream.write(FilterOutputStream.java:80)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2288) Change FileSystem API to support access control.

2007-12-03 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HADOOP-2288:
---

Attachment: (was: 2288_20071201.patch)

> Change FileSystem API to support access control.
> 
>
> Key: HADOOP-2288
> URL: https://issues.apache.org/jira/browse/HADOOP-2288
> Project: Hadoop
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.16.0
>
>
> - Some FileSystem methods like create and mkdir need an additional parameter 
> for permission.
> - FileSystem has to provide methods for setting permission, changing 
> ownership, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2316) [hbase] Run REST servlet outside of master

2007-12-03 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HADOOP-2316:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed (the failure was in the unrelated, reliably erratic TestTableMapReduce -- 
Jim is working on fixing its unpredictability).  Resolving.  Thanks for the 
feature, Bryan.

> [hbase] Run REST servlet outside of master
> --
>
> Key: HADOOP-2316
> URL: https://issues.apache.org/jira/browse/HADOOP-2316
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Reporter: Bryan Duxbury
>Assignee: Bryan Duxbury
>Priority: Minor
> Attachments: rest-external-v2.patch, rest-external.patch
>
>
> In order to support the desired deployment strategy, we need to be able to run 
> the REST servlet independently of the master info server. We should add a 
> new option to the bin/hbase command ("rest"?) that optionally takes a port 
> and bind address and starts the servlet outside of any other project.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2231) ShellCommand, in particular 'df -k', sometimes hang

2007-12-03 Thread Sameer Paranjpye (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548024
 ] 

Sameer Paranjpye commented on HADOOP-2231:
--

+1

> ShellCommand, in particular 'df -k', sometimes hang
> ---
>
> Key: HADOOP-2231
> URL: https://issues.apache.org/jira/browse/HADOOP-2231
> Project: Hadoop
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 0.15.1
>Reporter: Christian Kunz
> Fix For: 0.15.2
>
> Attachments: HADOOP-2231.patch
>
>
> We noticed that some pipes applications writing to dfs using libhdfs have 
> about a 6% chance of hanging when executing 'df -k' to find out whether there 
> is enough space available on the local filesystem before opening a file for 
> write.
> Why not use File.getFreeSpace() or File.getUsableSpace()?
> The call stack is:
> Exception in thread "main" java.io.IOException
>  at org.apache.hadoop.fs.ShellCommand.runCommand
> (ShellCommand.java:52)
>  at org.apache.hadoop.fs.ShellCommand.run(ShellCommand.java:42)
>  at org.apache.hadoop.fs.DF.getAvailable(DF.java:72)
>  at org.apache.hadoop.fs.LocalDirAllocator
> $AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:264)
>  at org.apache.hadoop.fs.LocalDirAllocator
> $AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:294)
>  at
> org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite
> (LocalDirAllocator.java:155)
>  at org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream.newBackupFile(DFSClient.java:1470)
>  at org.apache.hadoop.dfs.DFSClient
> $DFSOutputStream.openBackupStream(DFSClient.java:1437)
>  at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk
> (DFSClient.java:1579)
>  at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk
> (FSOutputSummer.java:140)
>  at org.apache.hadoop.fs.FSOutputSummer.write1
> (FSOutputSummer.java:100)
>  at org.apache.hadoop.fs.FSOutputSummer.write
> (FSOutputSummer.java:86)
>  at org.apache.hadoop.fs.FSDataOutputStream
> $PositionCache.write(FSDataOutputStream.java:39)
>  at java.io.DataOutputStream.write(DataOutputStream.java:90)
>  at java.io.FilterOutputStream.write(FilterOutputStream.java:80)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



RE: limiting memory use on namenode?

2007-12-03 Thread dhruba Borthakur
Sanjay and I had a discussion on this one earlier. We thought that this
would help Namenode robustness. We also thought that this was part of
Java 6, and we could make this feature optionally configurable.

Other resource limitations (other than memory) are CPU, network and
disk. We thought that we do not need to monitor those resources.

The monitoring of critical resources and the policy of what action to
take can be outside the actual Namenode process itself.

Thanks,
dhruba



-Original Message-
From: Doug Cutting [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 2:04 PM
To: hadoop-dev@lucene.apache.org
Subject: limiting memory use on namenode?

Over on pig-dev, Sam Pullara pointed out the following Java mechanism 
that permits one to be notified when memory usage exceeds a threshold.

http://java.sun.com/javase/6/docs/api/java/lang/management/MemoryPoolMXB
ean.html#UsageThreshold

Perhaps we could use something like this on the namenode?  For example, 
when memory usage is too high, the namenode could refuse creation of new

files.  This would still crash applications, but it would keep the 
filesystem itself from crashing in a way that is hard to recover while 
folks remove excessive (presumably) small files.

Doug


[jira] Updated: (HADOOP-2299) [hbase] Support inclusive scans

2007-12-03 Thread Bryan Duxbury (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Duxbury updated HADOOP-2299:
--

Assignee: Bryan Duxbury
  Status: Patch Available  (was: Open)

Submitting patch.

> [hbase] Support inclusive scans
> ---
>
> Key: HADOOP-2299
> URL: https://issues.apache.org/jira/browse/HADOOP-2299
> Project: Hadoop
>  Issue Type: Improvement
>  Components: contrib/hbase
>Reporter: Bryan Duxbury
>Assignee: Bryan Duxbury
>Priority: Trivial
> Attachments: inclusive-stop-row-v2.patch, inclusive-stop-row.patch
>
>
> The existing scanner interface excludes the end key from the result range. If 
> you actually want to do an inclusive scan for some reason, you would 
> currently have to guess at the key immediately after the end key, which is a 
> shoddy solution. 
> A new stoprow filter could be created that stops at the end key but also 
> returns it. Then, you could supply an extra parameter to getScanner to say 
> you want an inclusive scan.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-1841) IPC server should write repsonses asynchronously

2007-12-03 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548022
 ] 

dhruba borthakur commented on HADOOP-1841:
--

Mukund: Thanks for running the performance numbers.

Doug: This patch (by itself) will not improve performance. It is mostly a bug 
fix. Koji's earlier comment describes how slow clients caused a 
denial-of-service of the namenode on a real cluster. Mukund's performance runs 
show that performance is not impacted. However, I plan on attempting a 
follow-on patch to NameNode that will move the sync-to-edits-log from occurring 
in the RPC Handler thread, thus possibly improving scalability. Do you think 
that this patch is good to go?

Owen: would you like to review the code again? 

> IPC server should write repsonses asynchronously
> 
>
> Key: HADOOP-1841
> URL: https://issues.apache.org/jira/browse/HADOOP-1841
> Project: Hadoop
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Doug Cutting
>Assignee: dhruba borthakur
> Attachments: asyncRPC-2.patch, asyncRPC-4.patch, asyncRPC.patch, 
> asyncRPC.patch
>
>
> Hadoop's IPC Server currently writes responses from request handler threads 
> using blocking writes.  Performance and scalability might be improved if 
> responses were written asynchronously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2329) [Hbase Shell] Addition of Built-In Value Data Types for efficient accessing and stroing data

2007-12-03 Thread Edward Yoon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548017
 ] 

Edward Yoon commented on HADOOP-2329:
-

>> HADOOP-2197 would permit applications to tag columns with arbitrary 
>> key/value pairs. Thus an application could store family member/type 
>> information using this mechanism.

ugh... ok!

> [Hbase Shell] Addition of Built-In Value Data Types for efficient accessing 
> and stroing data
> 
>
> Key: HADOOP-2329
> URL: https://issues.apache.org/jira/browse/HADOOP-2329
> Project: Hadoop
>  Issue Type: New Feature
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Edward Yoon
>Assignee: Edward Yoon
> Fix For: 0.16.0
>
>
> A built-in data type is a fundamental data type that the hbase shell defines
> (character strings, scalars, ranges, arrays, ..., etc.).
> If you need a specialized data type that is not currently provided as a 
> built-in type, you are encouraged to write your own user-defined data type 
> using UDC (not yet implemented), or contribute it for distribution in a 
> future release of the hbase shell.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-1841) IPC server should write responses asynchronously

2007-12-03 Thread Mukund Madhugiri (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548019
 ] 

Mukund Madhugiri commented on HADOOP-1841:
--

I ran the sort benchmark on 500 nodes and here is the data:

trunk:

* sort: 1.776 hr
* random writer: 0.459 hr
* sort validation: 0.341 hr

trunk + patch:

* sort: 1.776 hr
* random writer: 0.423 hr
* sort validation: 0.450 hr


> IPC server should write responses asynchronously
> 
>
> Key: HADOOP-1841
> URL: https://issues.apache.org/jira/browse/HADOOP-1841
> Project: Hadoop
>  Issue Type: Improvement
>  Components: ipc
>Reporter: Doug Cutting
>Assignee: dhruba borthakur
> Attachments: asyncRPC-2.patch, asyncRPC-4.patch, asyncRPC.patch, 
> asyncRPC.patch
>
>
> Hadoop's IPC Server currently writes responses from request handler threads 
> using blocking writes.  Performance and scalability might be improved if 
> responses were written asynchronously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



limiting memory use on namenode?

2007-12-03 Thread Doug Cutting
Over on pig-dev, Sam Pullara pointed out the following Java mechanism 
that permits one to be notified when memory usage exceeds a threshold.


http://java.sun.com/javase/6/docs/api/java/lang/management/MemoryPoolMXBean.html#UsageThreshold

Perhaps we could use something like this on the namenode?  For example, 
when memory usage is too high, the namenode could refuse creation of new 
files.  This would still crash applications, but it would keep the 
filesystem itself from crashing in a way that is hard to recover from 
while folks remove the excess (presumably small) files.


Doug
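
A bare-bones sketch of that mechanism (not namenode code; the 90% figure and the 
refuseNewFiles flag are just placeholders):

{code}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

public class MemoryWatcherSketch {
  static volatile boolean refuseNewFiles = false;

  public static void install() {
    // Ask each heap pool to notify us when usage crosses 90% of its max.
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      long max = pool.getUsage().getMax();
      if (pool.getType() == MemoryType.HEAP
          && pool.isUsageThresholdSupported() && max > 0) {
        pool.setUsageThreshold((long) (max * 0.9));
      }
    }
    // The memory MXBean emits the threshold-exceeded notifications.
    NotificationEmitter emitter =
        (NotificationEmitter) ManagementFactory.getMemoryMXBean();
    emitter.addNotificationListener(new NotificationListener() {
      public void handleNotification(Notification n, Object handback) {
        if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(n.getType())) {
          refuseNewFiles = true;   // e.g. the namenode could start rejecting create()
        }
      }
    }, null, null);
  }
}
{code}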


[jira] Updated: (HADOOP-2299) [hbase] Support inclusive scans

2007-12-03 Thread Bryan Duxbury (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Duxbury updated HADOOP-2299:
--

Attachment: inclusive-stop-row-v2.patch

Forgot to include modifications to StopRowFilter.java (needed to make an 
attribute protected instead of private).

> [hbase] Support inclusive scans
> ---
>
> Key: HADOOP-2299
> URL: https://issues.apache.org/jira/browse/HADOOP-2299
> Project: Hadoop
>  Issue Type: Improvement
>  Components: contrib/hbase
>Reporter: Bryan Duxbury
>Priority: Trivial
> Attachments: inclusive-stop-row-v2.patch, inclusive-stop-row.patch
>
>
> The existing scanner interface excludes the end key from the result range. If 
> you actually want to do an inclusive scan for some reason, you would 
> currently have to guess at the key immediately after the end key, which is a 
> shoddy solution. 
> A new stoprow filter could be created that stops at the end key but also 
> returns it. Then, you could supply an extra parameter to getScanner to say 
> you want an inclusive scan.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2288) Change FileSystem API to support access control.

2007-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548035
 ] 

Hadoop QA commented on HADOOP-2288:
---

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12370877/2288_20071203.patch
against trunk revision r600627.

@author +1.  The patch does not contain any @author tags.

javadoc +1.  The javadoc tool did not generate any warning messages.

javac +1.  The applied patch does not generate any new compiler warnings.

findbugs -1.  The patch appears to introduce 3 new Findbugs warnings.

core tests -1.  The patch failed core unit tests.

contrib tests -1.  The patch failed contrib unit tests.

Test results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1247/testReport/
Findbugs warnings: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1247/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1247/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1247/console

This message is automatically generated.

> Change FileSystem API to support access control.
> 
>
> Key: HADOOP-2288
> URL: https://issues.apache.org/jira/browse/HADOOP-2288
> Project: Hadoop
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.16.0
>
> Attachments: 2288_20071203.patch
>
>
> - Some FileSystem methods like create and mkdir need an additional parameter 
> for permission.
> - FileSystem has to provide methods for setting permission, changing 
> ownership, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: limiting memory use on namenode?

2007-12-03 Thread Doug Cutting

dhruba Borthakur wrote:

Sanjay and I had a discussion on this one earlier. We thought that this
would help Namenode robustness.


Is there an issue in Jira for this?  If not, should we add one?


We also thought that this was part of
Java 6, and we could make this feature optionally configurable.


This was added in Java 1.5.  But we're planning to move to 1.6 in 0.16 
anyway.  So it does not need to be optional.  Whenever the heap is 
greater than, e.g., 90% of max, I think it would be best to not permit 
the creation of new files, don't you?


Doug
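
A polling variant of the same 90%-of-max check (again just a sketch, not namenode 
code):

{code}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapGuardSketch {
  /** True while the heap stays above roughly 90% of its configured max. */
  static boolean heapNearlyFull() {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    return heap.getMax() > 0 && heap.getUsed() > 0.9 * heap.getMax();
  }
}
{code}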




[jira] Updated: (HADOOP-1707) Remove the DFS Client disk-based cache

2007-12-03 Thread dhruba borthakur (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HADOOP-1707:
-

Attachment: clientDiskBuffer9.patch

Make patch compile with JDK 1.5

> Remove the DFS Client disk-based cache
> --
>
> Key: HADOOP-1707
> URL: https://issues.apache.org/jira/browse/HADOOP-1707
> Project: Hadoop
>  Issue Type: Improvement
>  Components: dfs
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Fix For: 0.16.0
>
> Attachments: clientDiskBuffer.patch, clientDiskBuffer2.patch, 
> clientDiskBuffer6.patch, clientDiskBuffer7.patch, clientDiskBuffer8.patch, 
> clientDiskBuffer9.patch, DataTransferProtocol.doc, DataTransferProtocol.html
>
>
> The DFS client currently uses a staging file on local disk to cache all 
> user-writes to a file. When the staging file accumulates 1 block worth of 
> data, its contents are flushed to a HDFS datanode. These operations occur 
> sequentially.
> A simple optimization of allowing the user to write to another staging file 
> while simultaneously uploading the contents of the first staging file to HDFS 
> will improve file-upload performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2325) Require Java 6 for release 0.16.

2007-12-03 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548046
 ] 

Owen O'Malley commented on HADOOP-2325:
---

Has anyone tried out the java 6 from http://landonf.bikemonkey.org/code/macosx/ 
?

> Require Java 6 for release 0.16.
> 
>
> Key: HADOOP-2325
> URL: https://issues.apache.org/jira/browse/HADOOP-2325
> Project: Hadoop
>  Issue Type: Improvement
>  Components: build
>Reporter: Doug Cutting
> Fix For: 0.16.0
>
>
> We should require Java 6 for release 0.16.  Java 6 is now available for OS/X. 
>  Hadoop performs much better on Java 6.  And, finally, there are features of 
> Java 6 (like 'df') that would be nice to use.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-2336) Shell commands to access and modify file permissions

2007-12-03 Thread Raghu Angadi (JIRA)
Shell commands to access and modify file permissions


 Key: HADOOP-2336
 URL: https://issues.apache.org/jira/browse/HADOOP-2336
 Project: Hadoop
  Issue Type: New Feature
  Components: fs
Reporter: Raghu Angadi
Assignee: Raghu Angadi
 Fix For: 0.16.0



Hadoop 0.16 includes file permissions in DFS, and we need FsShell to support the 
common file-permission commands:
- chown
- chgrp
- chmod

Also, output from some of the commands like {{ls -l}} will change to reflect the new 
file properties. The aim is to make the above commands look like their Unix/Linux 
counterparts. They will, of course, support only a subset of the options.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2171) [Hbase Shell] Socket Server for executing the hql query in other applications.

2007-12-03 Thread Edward Yoon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548041
 ] 

Edward Yoon commented on HADOOP-2171:
-

Guys, sorry if this is a poor question.

I'd like to re-start this issue.
How can I reopen it?

> [Hbase Shell] Socket Server for executing the hql query in other applications.
> -
>
> Key: HADOOP-2171
> URL: https://issues.apache.org/jira/browse/HADOOP-2171
> Project: Hadoop
>  Issue Type: New Feature
>Affects Versions: 0.15.0
> Environment: all environments
>Reporter: Edward Yoon
>Assignee: Edward Yoon
> Attachments: hadoop-2171.patch, patch_v01.txt, patch_v02.txt
>
>
> To support non-Java clients, protocols will be defined and implemented in a 
> socket server.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2288) Change FileSystem API to support access control.

2007-12-03 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HADOOP-2288:
---

Status: Open  (was: Patch Available)

> Change FileSystem API to support access control.
> 
>
> Key: HADOOP-2288
> URL: https://issues.apache.org/jira/browse/HADOOP-2288
> Project: Hadoop
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.16.0
>
> Attachments: 2288_20071203.patch
>
>
> - Some FileSystem methods like create and mkdir need an additional parameter 
> for permission.
> - FileSystem has to provide methods for setting permission, changing 
> ownership, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-2340) Limiting memory usage on namenode

2007-12-03 Thread dhruba borthakur (JIRA)
Limiting memory usage on namenode
-

 Key: HADOOP-2340
 URL: https://issues.apache.org/jira/browse/HADOOP-2340
 Project: Hadoop
  Issue Type: Improvement
  Components: dfs
Reporter: dhruba borthakur


When memory usage is too high, the namenode could refuse creation of new files. 
This would still crash applications, but it would keep the filesystem itself 
from crashing in a way that is hard to recover from while folks remove the 
excess (presumably small) files.

http://java.sun.com/javase/6/docs/api/java/lang/management/MemoryPoolMXBean.html#UsageThreshold

Other resource limitations (other than memory) are CPU, network and disk. We 
thought that we do not need to monitor those resources. The monitoring of 
critical resources and the policy of what action to take can be outside the 
actual Namenode process itself.

There are two causes of memory pressure on the Namenode. One is the creation of 
a large number of files; this reduces the free memory pool and the GC has to 
work even harder to recycle memory. The other is a burst of RPCs arriving at 
the Namenode (especially block reports); such a spurt causes free memory to 
drop dramatically within a couple of seconds and makes the GC work harder. And 
we know that when the GC runs hard, the server threads in the JVM starve for 
CPU, causing timeouts on clients.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2340) Limiting memory usage on namenode

2007-12-03 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548056
 ] 

dhruba borthakur commented on HADOOP-2340:
--

One line of reasoning is that if we never time out client RPC requests 
(HADOOP-2188), then the above situation will not occur; a GC run on the 
Namenode will just cause clients to block and slow down. My feeling is that we 
should observe the system post-2188 and then decide whether (and with what 
policy) we need to monitor Namenode resources.



> Limiting memory usage on namenode
> -
>
> Key: HADOOP-2340
> URL: https://issues.apache.org/jira/browse/HADOOP-2340
> Project: Hadoop
>  Issue Type: Improvement
>  Components: dfs
>Reporter: dhruba borthakur
>
> When memory usage is too high, the namenode could refuse creation of new 
> files.  This would still crash applications, but it would keep the filesystem 
> itself from crashing in a way that is hard to recover from while folks remove 
> the excess (presumably small) files.
> http://java.sun.com/javase/6/docs/api/java/lang/management/MemoryPoolMXBean.html#UsageThreshold
> Other resource limitations (other than memory) are CPU, network and disk. We 
> thought that we do not need to monitor those resources. The monitoring of 
> critical resources and the policy of what action to take can be outside the 
> actual Namenode process itself.
> There are two causes of memory pressure on the Namenode. One is the creation 
> of a large number of files; this reduces the free memory pool and the GC has 
> to work even harder to recycle memory. The other is a burst of RPCs arriving 
> at the Namenode (especially block reports); such a spurt causes free memory 
> to drop dramatically within a couple of seconds and makes the GC work harder. 
> And we know that when the GC runs hard, the server threads in the JVM starve 
> for CPU, causing timeouts on clients.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2288) Change FileSystem API to support access control.

2007-12-03 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HADOOP-2288:
---

Status: Patch Available  (was: Open)

> Change FileSystem API to support access control.
> 
>
> Key: HADOOP-2288
> URL: https://issues.apache.org/jira/browse/HADOOP-2288
> Project: Hadoop
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.16.0
>
> Attachments: 2288_20071203.patch, 2288_20071203b.patch
>
>
> - Some FileSystem methods like create and mkdir need an additional parameter 
> for permission.
> - FileSystem has to provide methods for setting permission, changing 
> ownership, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2288) Change FileSystem API to support access control.

2007-12-03 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HADOOP-2288:
---

Attachment: 2288_20071203b.patch

> Change FileSystem API to support access control.
> 
>
> Key: HADOOP-2288
> URL: https://issues.apache.org/jira/browse/HADOOP-2288
> Project: Hadoop
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.16.0
>
> Attachments: 2288_20071203.patch, 2288_20071203b.patch
>
>
> - Some FileSystem methods like create and mkdir need an additional parameter 
> for permission.
> - FileSystem has to provide methods for setting permission, changing 
> ownership, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2229) Provide a simple login implementation

2007-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548062
 ] 

Hadoop QA commented on HADOOP-2229:
---

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12370875/ugi6.patch
against trunk revision r600707.

@author +1.  The patch does not contain any @author tags.

javadoc +1.  The javadoc tool did not generate any warning messages.

javac +1.  The applied patch does not generate any new compiler warnings.

findbugs -1.  The patch appears to introduce 2 new Findbugs warnings.

core tests +1.  The patch passed core unit tests.

contrib tests -1.  The patch failed contrib unit tests.

Test results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1248/testReport/
Findbugs warnings: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1248/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1248/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1248/console

This message is automatically generated.

> Provide a simple login implementation
> -
>
> Key: HADOOP-2229
> URL: https://issues.apache.org/jira/browse/HADOOP-2229
> Project: Hadoop
>  Issue Type: New Feature
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Hairong Kuang
> Fix For: 0.16.0
>
> Attachments: ugi.patch, ugi1.patch, ugi2.patch, ugi3.patch, 
> ugi4.patch, ugi5.patch, ugi6.patch
>
>
> Give a simple implementation of HADOOP-1701.  Hadoop clients are assumed to 
> be started within a Unix-like network which provides user and group 
> management.  This implementation reads user information from the OS and sends 
> it to the NameNode in plaintext through RPC (see also HADOOP-2184).  The 
> NameNode trusts all information given and uses it for permission checking.
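
For illustration only (assuming a Unix-like host, as the issue does), one way a client 
could pick up the local user and group names before sending them over RPC:

{code}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class UnixLoginSketch {
  static String runCommand(String... cmd) throws IOException {
    Process p = new ProcessBuilder(cmd).start();
    BufferedReader in = new BufferedReader(new InputStreamReader(p.getInputStream()));
    try {
      return in.readLine();   // first line of output is all we need here
    } finally {
      in.close();
    }
  }

  public static void main(String[] args) throws IOException {
    String user = runCommand("whoami");
    String[] groups = runCommand("id", "-Gn").split("\\s+");
    System.out.println("user=" + user + " groups=" + java.util.Arrays.toString(groups));
  }
}
{code}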

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2338) [hbase] NPE in master server

2007-12-03 Thread Jim Kellerman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Kellerman updated HADOOP-2338:
--

Attachment: master.log.gz

Log file from master server.

> [hbase] NPE in master server
> 
>
> Key: HADOOP-2338
> URL: https://issues.apache.org/jira/browse/HADOOP-2338
> Project: Hadoop
>  Issue Type: Bug
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Jim Kellerman
>Assignee: Jim Kellerman
> Fix For: 0.16.0
>
> Attachments: master.log.gz
>
>
> Master gets an NPE after receiving multiple responses from the same server 
> telling the master it has opened a region.
> {code}
> 2007-12-02 20:31:37,515 DEBUG hbase.HRegion - Next sequence id for region 
> postlog,img254/577/02suecia024richardburnson0.jpg,1196619667879 is 73377537
> 2007-12-02 20:31:37,517 INFO  hbase.HRegion - region 
> postlog,img254/577/02suecia024richardburnson0.jpg,1196619667879 available
> 2007-12-02 20:31:39,200 WARN  hbase.HRegionServer - Processing message 
> (Retry: 0)
> java.io.IOException: java.io.IOException: java.lang.NullPointerException
> at org.apache.hadoop.hbase.HMaster.processMsgs(HMaster.java :1484)
> at org.apache.hadoop.hbase.HMaster.regionServerReport(HMaster.java:1423)
> at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java
>  :25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0 (Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java
>  :27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at 
> org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:82)
> at org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException 
> (RemoteExceptionHandler.java:48)
> at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:759)
> at java.lang.Thread.run(Thread.java:619)
>   case HMsg.MSG_REPORT_PROCESS_OPEN:
> synchronized ( this.assignAttempts) {
>   // Region server has acknowledged request to open region.
>   // Extend region open time by 1/2 max region open time.
> **1484**  assignAttempts.put(region.getRegionName (), 
>   Long.valueOf(assignAttempts.get(
>   region.getRegionName()).longValue() +
>   (this.maxRegionOpenTime / 2)));
> }
> break;
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Work started: (HADOOP-2339) [Hbase Shell] Delete command with no WHERE clause

2007-12-03 Thread Edward Yoon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HADOOP-2339 started by Edward Yoon.

> [Hbase Shell] Delete command with no WHERE clause
> -
>
> Key: HADOOP-2339
> URL: https://issues.apache.org/jira/browse/HADOOP-2339
> Project: Hadoop
>  Issue Type: Improvement
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Edward Yoon
>Assignee: Edward Yoon
> Fix For: 0.16.0
>
>
> using HbaseAdmin.deleteColumn() method.
> {code}
> DELETE column_name FROM table_name;
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-2339) [Hbase Shell] Delete command with no WHERE clause

2007-12-03 Thread Edward Yoon (JIRA)
[Hbase Shell] Delete command with no WHERE clause
-

 Key: HADOOP-2339
 URL: https://issues.apache.org/jira/browse/HADOOP-2339
 Project: Hadoop
  Issue Type: Improvement
  Components: contrib/hbase
Affects Versions: 0.16.0
Reporter: Edward Yoon
Assignee: Edward Yoon
 Fix For: 0.16.0


using HbaseAdmin.deleteColumn() method.

{code}
DELETE column_name FROM table_name;
{code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



RE: limiting memory use on namenode?

2007-12-03 Thread dhruba Borthakur
I will create a JIRA for this one.

I agree that Namenode should stop creating new files when it has already
used up a certain percentage of main memory.

There are two causes of memory pressure on the Namenode. One is the
creation of a large number of files; this reduces the free memory pool
and the GC has to work even harder to recycle memory. The other is a
burst of RPCs arriving at the Namenode (especially block reports); such
a spurt causes free memory to drop dramatically within a couple of
seconds and makes the GC work harder. And we know that when the GC runs
hard, the server threads in the JVM starve for CPU, causing timeouts on
clients.

One line of reasoning is that if we never time out client RPC requests
(HADOOP-2188), then the above situation will not occur; a GC run on the
Namenode will just cause clients to block and slow down.

My feeling is that we should observe the system post-2188 and then
decide whether (and with what policy) we need to monitor Namenode
resources.

Thanks,
dhruba



-Original Message-
From: Doug Cutting [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 03, 2007 3:04 PM
To: hadoop-dev@lucene.apache.org
Subject: Re: limiting memory use on namenode?

dhruba Borthakur wrote:
> Sanjay and I had a discussion on this one earlier. We thought that
this
> would help Namenode robustness.

Is there an issue in Jira for this?  If not, should we add one?

> We also thought that this was part of
> Java 6, and we could make this feature optionally configurable.

This was added in Java 1.5.  But we're planning to move to 1.6 in 0.16 
anyway.  So it does not need to be optional.  Whenever the heap is 
greater than, e.g., 90% of max, I think it would be best to not permit 
the creation of new files, don't you?

Doug




[jira] Created: (HADOOP-2337) Trash never closes FileSystem

2007-12-03 Thread Konstantin Shvachko (JIRA)
Trash never closes FileSystem
-

 Key: HADOOP-2337
 URL: https://issues.apache.org/jira/browse/HADOOP-2337
 Project: Hadoop
  Issue Type: Bug
  Components: dfs
Affects Versions: 0.15.0
Reporter: Konstantin Shvachko
 Fix For: 0.16.0


Trash opens FileSystem using Path.getFileSystem() but never closes it.
This happens even if Trash is disabled (trash.interval == 0). 
I think trash should not open file system if it is disabled.
I also think that NameNode should not create a trash Thread when trash is 
disabled, see NameNode.init().


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-2338) [hbase] NPE in master server

2007-12-03 Thread Jim Kellerman (JIRA)
[hbase] NPE in master server


 Key: HADOOP-2338
 URL: https://issues.apache.org/jira/browse/HADOOP-2338
 Project: Hadoop
  Issue Type: Bug
  Components: contrib/hbase
Affects Versions: 0.16.0
Reporter: Jim Kellerman
Assignee: Jim Kellerman
 Fix For: 0.16.0


Master gets an NPE after receiving multiple responses from the same server 
telling the master it has opened a region.

{code}
2007-12-02 20:31:37,515 DEBUG hbase.HRegion - Next sequence id for region 
postlog,img254/577/02suecia024richardburnson0.jpg,1196619667879 is 73377537
2007-12-02 20:31:37,517 INFO  hbase.HRegion - region 
postlog,img254/577/02suecia024richardburnson0.jpg,1196619667879 available
2007-12-02 20:31:39,200 WARN  hbase.HRegionServer - Processing message (Retry: 
0)
java.io.IOException: java.io.IOException: java.lang.NullPointerException
at org.apache.hadoop.hbase.HMaster.processMsgs(HMaster.java :1484)
at org.apache.hadoop.hbase.HMaster.regionServerReport(HMaster.java:1423)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java
 :25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0 (Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java
 :27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at 
org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:82)
at org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException 
(RemoteExceptionHandler.java:48)
at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:759)
at java.lang.Thread.run(Thread.java:619)


  case HMsg.MSG_REPORT_PROCESS_OPEN:
synchronized ( this.assignAttempts) {
  // Region server has acknowledged request to open region.
  // Extend region open time by 1/2 max region open time.
**1484**  assignAttempts.put(region.getRegionName (), 
  Long.valueOf(assignAttempts.get(
  region.getRegionName()).longValue() +
  (this.maxRegionOpenTime / 2)));
}
break;
{code}
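
A minimal sketch of a defensive guard at the same spot (not the committed fix): a 
duplicate "region opened" report can arrive after the assignAttempts entry has already 
been removed, so the lookup should not assume the entry is still present.

{code}
  case HMsg.MSG_REPORT_PROCESS_OPEN:
    synchronized (this.assignAttempts) {
      Long startTime = assignAttempts.get(region.getRegionName());
      if (startTime != null) {
        // Region server has acknowledged request to open region.
        // Extend region open time by 1/2 max region open time.
        assignAttempts.put(region.getRegionName(),
            Long.valueOf(startTime.longValue() + (this.maxRegionOpenTime / 2)));
      }
      // else: stale or duplicate report -- nothing left to extend
    }
    break;
{code}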


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-2342) create a micro-benchmark for measuring local-file versus hdfs read

2007-12-03 Thread Owen O'Malley (JIRA)
create a micro-benchmark for measuring local-file versus hdfs read


 Key: HADOOP-2342
 URL: https://issues.apache.org/jira/browse/HADOOP-2342
 Project: Hadoop
  Issue Type: Test
  Components: dfs
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.16.0


We should have a benchmark that measures reading a 10 GB file from HDFS and from 
local disk.
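
A bare-bones sketch of such a benchmark (paths and buffer size are placeholders): time 
a sequential read of the same large file once through HDFS and once through the local 
file system.

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadBenchmarkSketch {
  static long timeRead(FileSystem fs, Path path) throws IOException {
    byte[] buf = new byte[64 * 1024];
    long start = System.currentTimeMillis();
    FSDataInputStream in = fs.open(path);
    try {
      while (in.read(buf) != -1) {
        // just pull the bytes through
      }
    } finally {
      in.close();
    }
    return System.currentTimeMillis() - start;
  }

  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem hdfs = FileSystem.get(conf);          // fs.default.name points at HDFS
    FileSystem local = FileSystem.getLocal(conf);
    System.out.println("hdfs  ms: " + timeRead(hdfs, new Path(args[0])));
    System.out.println("local ms: " + timeRead(local, new Path(args[1])));
  }
}
{code}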

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2339) [Hbase Shell] Delete command with no WHERE clause

2007-12-03 Thread Edward Yoon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Yoon updated HADOOP-2339:


Attachment: 2339.patch

{code}
hql > select * from udanax;
+-+-+-+
| Row | Column  | Cell|
+-+-+-+
| row1| a:  | aa  |
+-+-+-+
| row1| b:  | aa  |
+-+-+-+
| row2| b:  | aa  |
+-+-+-+
3 row(s) in set. (0.06 sec)
hql > delete b: from udanax;
07/12/04 09:30:15 INFO hbase.HBaseAdmin: Disabled table udanax
07/12/04 09:30:15 INFO hbase.HBaseAdmin: Enabled table udanax
Column(s) deleted successfully. (10.16 sec)
hql > select * from udanax;
+-+-+-+
| Row | Column  | Cell|
+-+-+-+
| row1| a:  | aa  |
+-+-+-+
1 row(s) in set. (0.05 sec)
hql >
{code}

> [Hbase Shell] Delete command with no WHERE clause
> -
>
> Key: HADOOP-2339
> URL: https://issues.apache.org/jira/browse/HADOOP-2339
> Project: Hadoop
>  Issue Type: Improvement
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Edward Yoon
>Assignee: Edward Yoon
> Fix For: 0.16.0
>
> Attachments: 2339.patch
>
>
> using HbaseAdmin.deleteColumn() method.
> {code}
> DELETE column_name FROM table_name;
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2339) [Hbase Shell] Delete command with no WHERE clause

2007-12-03 Thread Edward Yoon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Yoon updated HADOOP-2339:


Status: Patch Available  (was: In Progress)

Submitting. I'd like to commit this to 0.16 TRUNK.

> [Hbase Shell] Delete command with no WHERE clause
> -
>
> Key: HADOOP-2339
> URL: https://issues.apache.org/jira/browse/HADOOP-2339
> Project: Hadoop
>  Issue Type: Improvement
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Edward Yoon
>Assignee: Edward Yoon
> Fix For: 0.16.0
>
> Attachments: 2339.patch
>
>
> using HbaseAdmin.deleteColumn() method.
> {code}
> DELETE column_name FROM table_name;
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-2341) Datanode active connections never returns to 0

2007-12-03 Thread Paul Saab (JIRA)
Datanode active connections never returns to 0
--

 Key: HADOOP-2341
 URL: https://issues.apache.org/jira/browse/HADOOP-2341
 Project: Hadoop
  Issue Type: Bug
  Components: dfs
Affects Versions: 0.16.0
Reporter: Paul Saab


On trunk I continue to see the following in my datanode logs:

2007-12-03 15:46:47,696 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
active connections is: 42
2007-12-03 15:46:48,135 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
active connections is: 41
2007-12-03 15:46:48,439 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
active connections is: 40
2007-12-03 15:46:48,479 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
active connections is: 39
2007-12-03 15:46:48,611 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
active connections is: 38
2007-12-03 15:46:48,898 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
active connections is: 37
2007-12-03 15:46:48,989 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
active connections is: 36
2007-12-03 15:46:51,010 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
active connections is: 35
2007-12-03 15:46:51,758 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
active connections is: 34
2007-12-03 15:46:52,148 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
active connections is: 33

This number never returns to 0, even after many hours of no new data being 
manipulated or added into the DFS.

Looking at netstat -tn, I see a significant amount of data in the send-q that 
never goes away:

tcp0  34240 :::XX.XX.XX.XXX:50010   :::YY.YY.YY.YY:55792   
ESTABLISHED 
tcp0  38968 :::XX.XX.XX.XXX:50010   :::YY.YY.YY.YY:38169   
ESTABLISHED 
tcp0  38456 :::XX.XX.XX.XXX:50010   :::YY.YY.YY.YY:35456   
ESTABLISHED 
tcp0  29640 :::XX.XX.XX.XXX:50010   :::YY.YY.YY.YY:59845   
ESTABLISHED 
tcp0  50168 :::XX.XX.XX.XXX:50010   :::YY.YY.YY.YY:44584   
ESTABLISHED 

When sniffing the network I see that the remote side (YY.YY.YY.YY) is returning 
a window size of 0
16:11:41.760474 IP XX.XX.XX.XXX.50010 > YY.YY.YY.YY.44584: . ack 3339984123 win 
46 
16:11:41.761597 IP YY.YY.YY.YY.44584 > XX.XX.XX.XXX.50010: . ack 1 win 0 


When we look at the stack traces on each datanode, there are tons of threads 
that *never* go away, all in the following trace:
{code}
Thread 6516 ([EMAIL PROTECTED]):
  State: RUNNABLE
  Blocked count: 0
  Waited count: 0
  Stack:
java.net.SocketOutputStream.socketWrite0(Native Method)
java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
java.net.SocketOutputStream.write(SocketOutputStream.java:136)
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
java.io.DataOutputStream.write(DataOutputStream.java:90)
org.apache.hadoop.dfs.DataNode$BlockSender.sendChunk(DataNode.java:1400)
org.apache.hadoop.dfs.DataNode$BlockSender.sendBlock(DataNode.java:1433)
org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:904)
org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:849)
java.lang.Thread.run(Thread.java:619)
{code}

Unfortunately there's very little in the logs, in the way of exceptions, that could 
point to this.  I have some exceptions like the following, but nothing that points to 
problems between XX and YY:
{code}
2007-12-02 11:19:47,889 WARN  dfs.DataNode - Unexpected error trying to delete 
block blk_4515246476002110310. Block not found in blockMap. 
2007-12-02 11:19:47,922 WARN  dfs.DataNode - java.io.IOException: Error in 
deleting blocks.
at org.apache.hadoop.dfs.FSDataset.invalidate(FSDataset.java:750)
at org.apache.hadoop.dfs.DataNode.processCommand(DataNode.java:675)
at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:569)
at org.apache.hadoop.dfs.DataNode.run(DataNode.java:1720)
at java.lang.Thread.run(Thread.java:619)
{code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-1823) want InputFormat for bzip2 files

2007-12-03 Thread Doug Cutting (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548075
 ] 

Doug Cutting commented on HADOOP-1823:
--

Why did you need to modify Ant's bzip2 code?  Could it not be used as is?

I'd hate to have to copy this code into Hadoop.  We could create our own jar of 
it, extracted from Ant's jar, or perhaps this would be an appropriate place to 
use Subversion's "externals" feature.  We could link to a tagged version of the 
sources in Ant's tree.

I note there's also a commons project which has copied this code from Ant, but 
it does not yet have any releases.  I guess we could include its nightly jar, 
since it is code that's already been released by Ant...

> want InputFormat for bzip2 files
> 
>
> Key: HADOOP-1823
> URL: https://issues.apache.org/jira/browse/HADOOP-1823
> Project: Hadoop
>  Issue Type: New Feature
>  Components: mapred
>Reporter: Doug Cutting
> Attachments: bzip2.jar
>
>
> Unlike gzip, the bzip file format supports splitting.  Compression is by 
> blocks (900k by default) and blocks are separated by a synchronization marker 
> (a 48-bit approximation of Pi).  This would permit very large compressed 
> files to be split into multiple map tasks, which is not currently possible 
> unless using a Hadoop-specific file format.
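
A sketch of why that marker makes the format splittable (illustration only, not an 
InputFormat): a reader can scan forward from an arbitrary offset to the next 48-bit 
block magic, 0x314159265359. Note that real bzip2 blocks are not byte-aligned, so an 
actual splitter has to scan at bit granularity; the byte-aligned version below just 
shows the idea.

{code}
import java.io.IOException;
import java.io.InputStream;

public class Bzip2BlockFinderSketch {
  private static final long BLOCK_MAGIC = 0x314159265359L;   // the "pi" marker
  private static final long MASK = 0xFFFFFFFFFFFFL;          // low 48 bits

  /** Returns the number of bytes consumed up to and including the marker, or -1. */
  public static long findNextBlock(InputStream in) throws IOException {
    long window = 0;
    long consumed = 0;
    int b;
    while ((b = in.read()) != -1) {
      consumed++;
      window = ((window << 8) | (b & 0xFF)) & MASK;
      if (window == BLOCK_MAGIC) {
        return consumed;
      }
    }
    return -1;
  }
}
{code}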

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2229) Provide a simple login implementation

2007-12-03 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HADOOP-2229:
--

Status: Open  (was: Patch Available)

> Provide a simple login implementation
> -
>
> Key: HADOOP-2229
> URL: https://issues.apache.org/jira/browse/HADOOP-2229
> Project: Hadoop
>  Issue Type: New Feature
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Hairong Kuang
> Fix For: 0.16.0
>
> Attachments: ugi.patch, ugi1.patch, ugi2.patch, ugi3.patch, 
> ugi4.patch, ugi5.patch, ugi6.patch, ugi7.patch
>
>
> Give a simple implementation of HADOOP-1701.  Hadoop clients are assumed to 
> be started within a Unix-like network which provides user and group 
> management.  This implementation reads user information from the OS and sends 
> it to the NameNode in plaintext through RPC (see also HADOOP-2184).  The 
> NameNode trusts all information given and uses it for permission checking.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2229) Provide a simple login implementation

2007-12-03 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HADOOP-2229:
--

Attachment: ugi7.patch

The patch fixes the new Findbugs warnings.

> Provide a simple login implementation
> -
>
> Key: HADOOP-2229
> URL: https://issues.apache.org/jira/browse/HADOOP-2229
> Project: Hadoop
>  Issue Type: New Feature
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Hairong Kuang
> Fix For: 0.16.0
>
> Attachments: ugi.patch, ugi1.patch, ugi2.patch, ugi3.patch, 
> ugi4.patch, ugi5.patch, ugi6.patch, ugi7.patch
>
>
> Give a simple implementation of HADOOP-1701.  Hadoop clients are assumed to 
> be started within a Unix-like network which provides user and group 
> management.  This implementation reads user information from the OS and sends 
> it to the NameNode in plaintext through RPC (see also HADOOP-2184).  The 
> NameNode trusts all information given and uses it for permission checking.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2339) [Hbase Shell] Delete command with no WHERE clause

2007-12-03 Thread Edward Yoon (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548073
 ] 

Edward Yoon commented on HADOOP-2339:
-

{code}
DELETE * FROM table_name;
{code}

NOTE: For now this just calls the deleteColumn() method in a loop.
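
Roughly what that loop could look like (hypothetical helper; the HBaseAdmin 
signatures are assumed here, and the table is disabled while columns are dropped, as 
the earlier log output shows):

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.HBaseAdmin;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.io.Text;

public class DeleteAllColumnsSketch {
  // Assumed signatures: HBaseAdmin(HBaseConfiguration), disableTable(Text),
  // deleteColumn(Text, Text), enableTable(Text).
  public static void deleteColumns(Text tableName, Text[] columns) throws IOException {
    HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
    admin.disableTable(tableName);     // table must be offline while columns are dropped
    try {
      for (Text column : columns) {
        admin.deleteColumn(tableName, column);
      }
    } finally {
      admin.enableTable(tableName);    // bring the table back online
    }
  }
}
{code}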

> [Hbase Shell] Delete command with no WHERE clause
> -
>
> Key: HADOOP-2339
> URL: https://issues.apache.org/jira/browse/HADOOP-2339
> Project: Hadoop
>  Issue Type: Improvement
>  Components: contrib/hbase
>Affects Versions: 0.16.0
>Reporter: Edward Yoon
>Assignee: Edward Yoon
> Fix For: 0.16.0
>
> Attachments: 2339.patch
>
>
> using HbaseAdmin.deleteColumn() method.
> {code}
> DELETE column_name FROM table_name;
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HADOOP-2229) Provide a simple login implementation

2007-12-03 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang updated HADOOP-2229:
--

Status: Patch Available  (was: Open)

> Provide a simple login implementation
> -
>
> Key: HADOOP-2229
> URL: https://issues.apache.org/jira/browse/HADOOP-2229
> Project: Hadoop
>  Issue Type: New Feature
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Hairong Kuang
> Fix For: 0.16.0
>
> Attachments: ugi.patch, ugi1.patch, ugi2.patch, ugi3.patch, 
> ugi4.patch, ugi5.patch, ugi6.patch, ugi7.patch
>
>
> Give a simple implementation of HADOOP-1701.  Hadoop clients are assumed to 
> be started within a Unix-like network which provides user and group 
> management.  This implementation reads user information from the OS and sends 
> it to the NameNode in plaintext through RPC (see also HADOOP-2184).  The 
> NameNode trusts all information given and uses it for permission checking.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2341) Datanode active connections never returns to 0

2007-12-03 Thread dhruba borthakur (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548080
 ] 

dhruba borthakur commented on HADOOP-2341:
--

It appears that there are HDFS clients running on machines marked as 
yy.yy.yy.yy. They are trying to read files from the datanodes.

> Datanode active connections never returns to 0
> --
>
> Key: HADOOP-2341
> URL: https://issues.apache.org/jira/browse/HADOOP-2341
> Project: Hadoop
>  Issue Type: Bug
>  Components: dfs
>Affects Versions: 0.16.0
>Reporter: Paul Saab
>
> On trunk I continue to see the following in my datanode logs:
> 2007-12-03 15:46:47,696 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
> active connections is: 42
> 2007-12-03 15:46:48,135 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
> active connections is: 41
> 2007-12-03 15:46:48,439 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
> active connections is: 40
> 2007-12-03 15:46:48,479 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
> active connections is: 39
> 2007-12-03 15:46:48,611 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
> active connections is: 38
> 2007-12-03 15:46:48,898 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
> active connections is: 37
> 2007-12-03 15:46:48,989 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
> active connections is: 36
> 2007-12-03 15:46:51,010 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
> active connections is: 35
> 2007-12-03 15:46:51,758 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
> active connections is: 34
> 2007-12-03 15:46:52,148 DEBUG dfs.DataNode - XX.XX.XX.XXX:50010:Number of 
> active connections is: 33
> This number never returns to 0, even after many hours of no new data being 
> manipulated or added into the DFS.
> Looking at netstat -tn, I see a significant amount of data in the send-q that 
> never goes away:
> tcp0  34240 :::XX.XX.XX.XXX:50010   :::YY.YY.YY.YY:55792   
> ESTABLISHED 
> tcp0  38968 :::XX.XX.XX.XXX:50010   :::YY.YY.YY.YY:38169   
> ESTABLISHED 
> tcp0  38456 :::XX.XX.XX.XXX:50010   :::YY.YY.YY.YY:35456   
> ESTABLISHED 
> tcp0  29640 :::XX.XX.XX.XXX:50010   :::YY.YY.YY.YY:59845   
> ESTABLISHED 
> tcp0  50168 :::XX.XX.XX.XXX:50010   :::YY.YY.YY.YY:44584   
> ESTABLISHED 
> When sniffing the network I see that the remote side (YY.YY.YY.YY) is 
> returning a window size of 0
> 16:11:41.760474 IP XX.XX.XX.XXX.50010 > YY.YY.YY.YY.44584: . ack 3339984123 
> win 46 
> 16:11:41.761597 IP YY.YY.YY.YY.44584 > XX.XX.XX.XXX.50010: . ack 1 win 0 
> 
> When we look at the stack traces on each datanode, there are tons of 
> threads that *never* go away, all in the following trace:
> {code}
> Thread 6516 ([EMAIL PROTECTED]):
>   State: RUNNABLE
>   Blocked count: 0
>   Waited count: 0
>   Stack:
> java.net.SocketOutputStream.socketWrite0(Native Method)
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
> java.io.DataOutputStream.write(DataOutputStream.java:90)
> org.apache.hadoop.dfs.DataNode$BlockSender.sendChunk(DataNode.java:1400)
> org.apache.hadoop.dfs.DataNode$BlockSender.sendBlock(DataNode.java:1433)
> org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:904)
> org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:849)
> java.lang.Thread.run(Thread.java:619)
> {code}
> Unfortunately there's very little in the logs, in the way of exceptions, that could 
> point to this.  I have some exceptions like the following, but nothing that points 
> to problems between XX and YY:
> {code}
> 2007-12-02 11:19:47,889 WARN  dfs.DataNode - Unexpected error trying to 
> delete block blk_4515246476002110310. Block not found in blockMap. 
> 2007-12-02 11:19:47,922 WARN  dfs.DataNode - java.io.IOException: Error in 
> deleting blocks.
> at org.apache.hadoop.dfs.FSDataset.invalidate(FSDataset.java:750)
> at org.apache.hadoop.dfs.DataNode.processCommand(DataNode.java:675)
> at org.apache.hadoop.dfs.DataNode.offerService(DataNode.java:569)
> at org.apache.hadoop.dfs.DataNode.run(DataNode.java:1720)
> at java.lang.Thread.run(Thread.java:619)
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HADOOP-2288) Change FileSystem API to support access control.

2007-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548081
 ] 

Hadoop QA commented on HADOOP-2288:
---

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12370894/2288_20071203b.patch
against trunk revision r600707.

@author +1.  The patch does not contain any @author tags.

javadoc +1.  The javadoc tool did not generate any warning messages.

javac +1.  The applied patch does not generate any new compiler warnings.

findbugs +1.  The patch does not introduce any new Findbugs warnings.

core tests +1.  The patch passed core unit tests.

contrib tests -1.  The patch failed contrib unit tests.

Test results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1249/testReport/
Findbugs warnings: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1249/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1249/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1249/console

This message is automatically generated.

> Change FileSystem API to support access control.
> 
>
> Key: HADOOP-2288
> URL: https://issues.apache.org/jira/browse/HADOOP-2288
> Project: Hadoop
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 0.15.0
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.16.0
>
> Attachments: 2288_20071203.patch, 2288_20071203b.patch
>
>
> - Some FileSystem methods like create and mkdir need an additional parameter 
> for permission.
> - FileSystem has to provide methods for setting permission, changing 
> ownership, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


