Re: [zfs-discuss] Deleting large amounts of files

2010-07-29 Thread Brandon High
On Tue, Jul 20, 2010 at 9:48 AM, Hernan Freschi  wrote:
> Is there a way to see which files are using dedup? Or should I just
> copy everything to a new ZFS?

Using 'zfs send' to copy the datasets will work, and it preserves metadata
that a plain file copy would lose.
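
A minimal sketch of that approach (the dataset names are placeholders; only the
'tera' pool name comes from this thread), assuming dedup is already off so the
received copy is written un-deduped:

    # confirm dedup is off where the new copy will be written
    zfs get dedup tera

    # snapshot the source filesystem and replicate it into a new dataset
    zfs snapshot tera/data@undedup
    zfs send tera/data@undedup | zfs receive tera/data.new

    # after verifying the copy, retire the old dataset and rename the new one
    zfs destroy -r tera/data
    zfs rename tera/data.new tera/data

Keep in mind that destroying the old, deduped dataset is itself the kind of
delete that has to walk the dedup table to correct reference counts, so expect
that last step to be slow as well.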

-B

-- 
Brandon High : bh...@freaks.com


Re: [zfs-discuss] Deleting large amounts of files

2010-07-29 Thread Constantin Gonzalez

Hi,


> Is there a way to see which files have been deduped, so I can copy them again
> and un-dedupe them?


Unfortunately, that's not easy (I've tried it :) ).

The issue is that the dedup table (which knows which blocks have been deduped)
doesn't know about files.

And if you pull block pointers for deduped blocks from the dedup table,
you'll need to backtrack from there through the filesystem structure
to figure out what files are associated with those blocks.

(remember: Deduplication happens at the block level, not the file level.)

So, in order to compile a list of deduped _files_, one would need to extract
the list of deduped _blocks_ from the dedup table, then chase the pointers
from the root of the zpool to the blocks in order to figure out what files
they're associated with.

Unless there's a different way that I'm not aware of (and I hope someone can
correct me here), the only way to do that is to run a scrub-like process and
build up a table of files and their blocks.
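
What you can inspect fairly cheaply is the dedup table itself, although it will
not tell you any file names. A small sketch (pool name taken from this thread):

    # summary of the dedup table: entry counts, on-disk/in-core sizes, dedup ratio
    zdb -D tera

    # same, plus a histogram of how many blocks are referenced how many times
    zdb -DD tera

Entries with a reference count above 1 are the blocks that are actually shared;
mapping them back to file names is exactly the expensive, scrub-like walk
described above.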

Cheers,
  Constantin

--

Constantin Gonzalez Schmitz | Principal Field Technologist
Phone: +49 89 460 08 25 91 || Mobile: +49 172 834 90 30
Oracle Hardware Presales Germany



Re: [zfs-discuss] Deleting large amounts of files

2010-07-21 Thread Hernan Freschi
On Tue, Jul 20, 2010 at 1:40 PM, Ulrich Graef  wrote:

> When you are writing to a file and dedup is currently enabled, then the
> data is entered into the dedup table of the pool.
> (There is one dedup table per pool, not per zfs.)
>
> Switching off dedup does not change this data.
Yes, I suppose so (just as enabling dedup or compression doesn't alter
on-disk data).

> After switching off dedup, the dedup table is used until such a file is
> deleted or overwritten.

> Deleting or overwriting then accesses the dedup table and corrects the
> reference count.

Is there a way to see which files are using dedup? Or should I just
copy everything to a new ZFS?


Re: [zfs-discuss] Deleting large amounts of files

2010-07-21 Thread Ulrich Graef

Hi,

Hernan Freschi wrote:

> Hi, thanks for answering,
>
>> How large is your ARC / your main memory?
>>   Probably too small to hold all metadata (1/1000 of the data amount).
>>   => metadata has to be read again and again
>
> Main memory is 8GB. ARC (according to arcstat.pl) usually stays at 5-7GB
>
>> A recordsize smaller than 128k increases the problem.
>
> recordsize is the default, 128k.
>
>> It's a data volume, perhaps raidz or raidz2, and you are using an older
>> ZPOOL version?
>
> It's raidz, pool version is 22
>
>>   Reading is done for the whole raid stripe when you are reading a block.
>>   => the whole raidz stripe has the attributes of a single disk (see
>>   Roch's blog).
>>
>> The number of files is not specified.
>
> some 20 files deleted, each about 4GB in size
>
>> Updating the dedup table needs random access of the table.
>
> dedup was enabled at some point, but I disabled it long ago. Does it
> still matter? Should I copy all these files again (or zfs send) to
> un-dedup those blocks?

When you are writing to a file and dedup is currently enabled, then the
data is entered into the dedup table of the pool.
(There is one dedup table per pool, not per zfs.)

Switching off dedup does not change this data.
After switching off dedup, the dedup table is used until such a file is
deleted or overwritten.
Deleting or overwriting then accesses the dedup table and corrects the
reference count.

Therefore you will see effects of the dedup table long after switching
dedup off, as long as the file was written while dedup was switched on.
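
A quick, read-only way to check whether such blocks are still around (the pool
name below is the one from this thread):

    # A dedupratio above 1.00x means shared (deduplicated) blocks still exist.
    # Even at exactly 1.00x the pool can still carry dedup-table entries for
    # blocks that were written while dedup was on.
    zpool get dedupratio tera

    # Per-dataset view of whether dedup is enabled for new writes.
    zfs get -r dedup tera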


>> ~ 60 reads per second is normal for a SATA disk with 7200 RPM.
>
> Shouldn't ~60 reads per second at about 128k (not counting prefetch) be
> about 7 MB/s, instead of the 144 KB/s (!) I'm getting?

ZFS can use smaller blocks, and metadata is usually compressed. In the
dedup table it is also possible that a block compresses well: when only one
slot of a block is used, a 128k (logical) block can be stored in a 2k
physical block.
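
For what it's worth, the iostat numbers from the original post fit that
explanation. A rough back-of-the-envelope check:

    144 KB/s  /  ~63 reads per second  ≈  2.3 KB per read

That is, each read is pulling in a small, compressed metadata or dedup-table
block rather than a full 128k record; 60 reads per second of full 128k records
would indeed come to roughly 7-8 MB/s.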

Regards,

   Ulrich


--
Ulrich Graef / Sales Consultant / Hardware Presales / Phone: + 49 6103 752 359
ORACLE Deutschland B.V. & Co. KG / Amperestr. 6 / 63225 Langen
http://www.oracle.com




Re: [zfs-discuss] Deleting large amounts of files

2010-07-21 Thread Hernan F
I have 8GB RAM, arcsz as reported by arcstat.pl is 5-7GB usually.

It took about 20-30 mins to delete the files.

Is there a way to see which files have been deduped, so I can copy them again
and un-dedupe them?

Thanks,
Hernan


Re: [zfs-discuss] Deleting large amounts of files

2010-07-19 Thread Hernan Freschi
Hi, thanks for answering,

> How large is your ARC / your main memory?
>   Probably too small to hold all metadata (1/1000 of the data amount).
>   => metadata has to be read again and again

Main memory is 8GB. ARC (according to arcstat.pl) usually stays at 5-7GB

> A recordsize smaller than 128k increases the problem.

recordsize is the default, 128k.

> It's a data volume, perhaps raidz or raidz2, and you are using an older ZPOOL
> version?
It's raidz, pool version is 22

>   Reading is done for the whole raid stripe when you are reading a block.
>
>   => the whole raidz stripe has the attributes of a single disk (see Roch's 
> blog).
>
> The number of files is not specified.
some 20 files deleted, each about 4GB in size

> Updating the dedup table needs random access of the table.
dedup was enabled at some point, but I disabled it long ago. Does it
still matter? Should I copy all these files again (or zfs send) to
un-dedup those blocks?

>
> ~ 60 reads per second is normal for a SATA disk with 7200 RPM.

Shouldn't ~60 reads per second at about 128k (not counting prefetch) be
about 7 MB/s, instead of the 144 KB/s (!) I'm getting?



Re: [zfs-discuss] Deleting large amounts of files

2010-07-19 Thread Ulrich Graef
Hi,

some information is missing...

How large is your ARC / your main memory?
   Probably too small to hold all metadata (1/1000 of the data amount).
   => metadata has to be read again and again
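
As a rough worked example of that 1/1000 rule, using the ~3.12 TB allocated in
the 'tera' pool from your iostat output (estimates only):

    3.12 TB / 1000                         ≈  3 GB of metadata
    3.12 TB / 128 KB (default recordsize)  ≈  26 million blocks

and if those blocks were written with dedup enabled, each one also has a
dedup-table entry of a few hundred bytes (by the usual rule of thumb), which
adds up to several more gigabytes that want to stay cached.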

A recordsize smaller than 128k increases the problem.

It's a data volume, perhaps raidz or raidz2, and you are using an older ZPOOL
version?
   Reading is done for the whole raid stripe when you are reading a block.

   => the whole raidz stripe has the attributes of a single disk (see Roch's 
blog).

The number of files is not specified.

Updating the dedup table needs random access of the table.

~ 60 reads per second is normal for a SATA disk with 7200 RPM.
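
That figure follows from the mechanics of the drive; a rough estimate:

    7200 RPM  =>  8.3 ms per revolution  =>  ~4.2 ms average rotational latency
    ~4.2 ms rotational + ~8-9 ms average seek  ≈  12-13 ms per random I/O
    1000 ms / ~12.5 ms  ≈  80 random reads per second (best case)

so 60-80 small random reads per second per raidz vdev is about what you can
expect once the working set no longer fits in the ARC.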

So far nothing surprising...


Regards,

Ulrich





Re: [zfs-discuss] Deleting large amounts of files

2010-07-19 Thread Scott Meilicke
If these files are deduped, and there is not a lot of RAM on the machine, it 
can take a long, long time to work through the dedupe portion. I don't know 
enough to know if that is what you are experiencing, but it could be the 
problem.

How much RAM do you have?

Scott


[zfs-discuss] Deleting large amounts of files

2010-07-19 Thread Hernan F
Hello,
I think this is the second time this has happened to me. A couple of years ago, I
deleted a big (500G) zvol and then the machine started to hang some 20 minutes
later (out of memory); even rebooting didn't help. But with the great support
from Victor Latushkin, who on a weekend helped me debug the problem (abort the
transaction and restart it again, which required some black magic and
recompiling of ZFS), it worked.

Now I'm facing a similar problem. I was writing about 20GB (from CIFS) to a
filesystem. While that was going on, I deleted some old files, freeing up about
60GB in the process. After Windows was done deleting those (it was instant), I
tried to delete another file, which I didn't have permission to. So I SSHed to the
machine and removed it manually (pfexec rm file). And that's where the problems
started.

First, I noticed the rm wasn't instant. It was taking a long time (over 5 minutes). I
tried Ctrl-C, Ctrl-Z, another SSH and kill; nothing worked. After a while it
died with "Killed". I did a "zfs list" and noticed the free space wasn't
updated.

I tried "sync", it also hangs. I try a reboot - it won't, I guess it's waiting 
for the sync to finish. So I hard reboot the machine. When it comes back I can 
access the ZFS pool again. I go to the directory where I tried to delete the 
files with "rm": files are still there (they weren't before the reboot).

I try a "sync" again. Same result (hang). "top" shows a decreasing amount of 
free memory. zpool iostat 5 shows:

rpool       69.4G  79.6G      0      0      0      0
tera        3.12T   513G     63      0   144K      0
----------  -----  -----  -----  -----  -----  -----
rpool       69.4G  79.6G      0      0      0      0
tera        3.12T   513G     63      0   142K      0
----------  -----  -----  -----  -----  -----  -----
rpool       69.4G  79.6G      0      0      0      0
tera        3.12T   513G     62      0   142K      0
----------  -----  -----  -----  -----  -----  -----
rpool       69.4G  79.6G      0      0      0      0
tera        3.12T   513G     64      0   144K      0
----------  -----  -----  -----  -----  -----  -----
rpool       69.4G  79.6G      0      0      0      0
tera        3.12T   513G     65      0   148K      0

Could this be related to the fact that I THINK I enabled deduplication on this
pool a while ago (but then I disabled it due to performance reasons)?

What should I do? Do I have to wait for these "reads" to finish? Why are they 
so slow anyway?

Thanks,
Hernan