A night's sleep helped, and I worked out what happened:
- data and data-backup were both mounted, so the Windows sharing applied to both
- after the shutdown on Fri 15 and the reboot on Sat 16, the Windows machines
mounted the data-backup shares instead of the usual data shares
- the Mac OS X machines, which mount the stile CIFS shares, correctly mounted
the data ones
- the Windows users were silently working on data-backup instead of data
- on Tue 19, relaunching zfs send/recv from data to data-backup destroyed all
the work the Windows users had done since Mon
- this is why I saw no change in the data snapshots since Mon 16, apart from
the Mac work on stile
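To spot this kind of double exposure early, something like the following could
be run on the storage. It is only a rough sketch: flag_exposed is a made-up
helper name, it assumes a POSIX awk (nawk on illumos), and it only reads the
tab-separated output of zfs get -H, it changes nothing.

```shell
#!/bin/sh
# Hypothetical helper (not part of zfs): reads `zfs get -H` output on stdin
# (tab-separated: name, property, value) and prints every dataset of the
# given pool that is currently mounted or has SMB sharing enabled.
flag_exposed() {
    pool="$1"
    awk -F '\t' -v pool="$pool" '
        # skip anything that is not the pool itself or one of its descendants
        $1 != pool && index($1, pool "/") != 1 { next }
        $2 == "mounted"  && $3 == "yes"              { print $1 ": mounted" }
        $2 == "sharesmb" && $3 != "off" && $3 != "-" { print $1 ": sharesmb=" $3 }
    '
}

# Possible use on the storage box (not executed here):
#   zfs get -r -H -o name,property,value mounted,sharesmb data-backup \
#       | flag_exposed data-backup
```

Any line printed for data-backup would mean the backup copy is visible to
clients and could be picked up instead of data.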
What I have to do (in any case this will lose any data changed from Monday up
to the resync):
- halt all work by the Windows users
- find every file in data-backup changed after 15 Nov 2013, and copy it to
data
- export data-backup
- import -N data-backup (no mount, no sharing)
- reboot the storage and check that only the correct shares are present
- let people work on the shares and verify they're working on data
I was lucky that these Windows users have few files there (around 30GB).
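The copy step could look roughly like this. It is a sketch under assumptions:
copy_newer is a made-up helper, the mount points /data-backup/windows and
/data/windows and the cut-off time are guesses to adjust, DST must be an
absolute path, and file names containing newlines are not handled. It
overwrites files of the same name in the destination, so a dry look at the
find output first would be wise.

```shell
#!/bin/sh
# Hypothetical helper: copy every regular file under SRC modified after STAMP
# (touch -t format, [[CC]YY]MMDDhhmm) to the same relative path under DST,
# keeping modification times. DST must be an absolute path.
copy_newer() {
    src="$1"; dst="$2"; stamp="$3"
    ref=$(mktemp) || return 1
    touch -t "$stamp" "$ref"        # reference file carrying the cut-off time
    (
        cd "$src" || exit 1
        find . -type f -newer "$ref" -print | while IFS= read -r f; do
            mkdir -p "$dst/$(dirname "$f")" && cp -p "$f" "$dst/$f"
        done
    )
    rc=$?
    rm -f "$ref"
    return $rc
}

# Possible use (paths and cut-off are assumptions, not executed here):
#   copy_newer /data-backup/windows /data/windows 201311160000
# Then, assuming data-backup is a pool of its own:
#   zpool export data-backup
#   zpool import -N data-backup     # import without mounting any file system
```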
Gabriele.
From: Gabriele Bulfon
To: [email protected]
Cc: Raffaele Fullone
Date: 20 November 2013 21:01:38 CET
Subject: [discuss] Serious ZFS problem
Hi,
I'm in a very serious situation with an illumos-based ZFS storage system, one
that may completely undermine the solution...
I really don't know what happened, so I'll try to describe the history of the
last week.
First, the problem: I found part of the zfs filesystem rolled back to a past
date (15 Nov), yet containing snapshots from the following days. Those later
snapshots show 0 usage, as if nothing had changed, and new snapshots still
show 0 usage, as if nothing is changing, and they all contain the state as of
that past date.
The system has a zpool with a "data" filesystem, divided into:
- data/stile
- data/windows
- data/windows/*some-different-zfs*
These are all shared on the AD via CIFS.
The system takes periodic recursive snapshots of the data pool every two
hours during the day, kept for a week, plus one every Saturday, kept for a
month.
Because it's a recent installation, we have not yet automated the zfs
send/recv to a backup system, so I run the send/recv manually every 2-3 days.
On that day, 15 Nov at 17:xx, I created a "backup10" snapshot (knowing I had
backup9), and then sent the increment from 9 to 10 to the receiving system.
Then I deleted the old backup snapshot on the origin (backup9):
zfs snapshot -r data@backup10
zfs send -Rv -i data@backup9 data@backup10 | zfs receive -Fd data-backup
zfs destroy -r data@backup9
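For reference, the manual cycle above could be wrapped as below. This is only
a sketch: backup_cycle_cmds is a made-up name, it prints the commands instead
of running them, and the -u on the receive is my addition, not what was run.
-u asks zfs receive to leave the received file systems unmounted; as far as I
understand it only affects the receive itself and would not stop them from
being mounted at the next import or boot.

```shell
#!/bin/sh
# Hypothetical wrapper: print (do not run) the commands for one incremental
# cycle. Usage: backup_cycle_cmds SRC_POOL DST_POOL PREV_NUM NEXT_NUM
backup_cycle_cmds() {
    src="$1"; dst="$2"; prev="$3"; next="$4"
    echo "zfs snapshot -r ${src}@backup${next}"
    # -u is an addition to the original command line: do not mount what is received
    echo "zfs send -Rv -i ${src}@backup${prev} ${src}@backup${next} | zfs receive -Fdu ${dst}"
    echo "zfs destroy -r ${src}@backup${prev}"
}

# Prints the three commands for the backup10 -> backup11 cycle
backup_cycle_cmds data data-backup 10 11
```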
Everything went fine until yesterday, 19 Nov, when I wanted to resync the 
remote system:
zfs snapshot -r data@backup11
zfs send -Rv -i data@backup10 data@backup11 | zfs receive -Fd data-backup
zfs destroy -r data@backup10
Many more snapshots were present between backup10 and backup11.
After a few minutes, I was called because something was missing.
After some analysis I found that some live filesystems were back to 15 Nov...
As if the system had been rolled back to the snapshot right after backup10
(the destroyed one).
But not all of them, just data/windows/*, not data/stile.
What's more, all the snapshots after that date, even the ones being created
right now, always show 0 usage. As if nothing is changing. But people are
working, and luckily that fs is not used much, so only a handful of files had
to be updated by hand.
But ZFS doesn't show it. Look at one of them:
data/windows/commerciale                               27.5G  1.20T  27.4G  /data/windows/commerciale
data/windows/commerciale@SATURDAY_2013-11-09_20:00:03  4.73M      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-13_09:00:00      669K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-13_11:00:01      218K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-13_13:00:01      212K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-13_15:00:01      190K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-13_17:00:01      192K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-13_19:00:01       43K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-14_09:00:01       45K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-14_11:00:01      184K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-14_13:00:01      175K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-14_15:00:01      204K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-14_17:00:01      202K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-14_19:00:02       72K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-15_09:00:01      214K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-15_11:00:01      194K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-15_13:00:01      202K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-15_15:00:01      178K      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-15_17:00:02      376K      -  27.4G  -
**here was my backup10 that I destroyed**
data/windows/commerciale@DAILY_2013-11-15_19:00:01       53K      -  27.4G  -
data/windows/commerciale@SATURDAY_2013-11-16_20:00:03      0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-18_09:00:01         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-18_11:00:01         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-18_13:00:02         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-18_15:00:01         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-18_17:00:01         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-18_19:00:00         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-19_09:00:02         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-19_11:00:01         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-19_13:00:01         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-19_15:00:01         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-19_17:00:01         0      -  27.4G  -
data/windows/commerciale@backup11                          0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-19_19:00:01         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-20_09:00:01         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-20_13:00:02         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-20_15:00:01         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-20_17:00:02         0      -  27.4G  -
data/windows/commerciale@DAILY_2013-11-20_19:00:01         0      -  27.4G  -
Look at the fine one (stile):
data/stile                                             1.42T   595G  1.41T  /data/stile
data/stile@SATURDAY_2013-11-09_20:00:03                 143M      -  1.41T  -
data/stile@DAILY_2013-11-13_09:00:00                   41.1M      -  1.41T  -
data/stile@DAILY_2013-11-13_11:00:01                   9.44M      -  1.41T  -
data/stile@DAILY_2013-11-13_13:00:01                   12.1M      -  1.41T  -
data/stile@DAILY_2013-11-13_15:00:01                   19.2M      -  1.41T  -
data/stile@DAILY_2013-11-13_17:00:01                   12.1M      -  1.41T  -
data/stile@DAILY_2013-11-13_19:00:01                   6.07M      -  1.41T  -
data/stile@DAILY_2013-11-14_09:00:01                    708K      -  1.41T  -
data/stile@DAILY_2013-11-14_11:00:01                   3.10M      -  1.41T  -
data/stile@DAILY_2013-11-14_13:00:01                   24.4M      -  1.41T  -
data/stile@DAILY_2013-11-14_15:00:01                   34.9M      -  1.41T  -
data/stile@DAILY_2013-11-14_17:00:01                    704M      -  1.41T  -
data/stile@DAILY_2013-11-14_19:00:02                    906K      -  1.41T  -
data/stile@DAILY_2013-11-15_09:00:01                    778K      -  1.41T  -
data/stile@DAILY_2013-11-15_11:00:01                   14.9M      -  1.41T  -
data/stile@DAILY_2013-11-15_13:00:01                   1.81M      -  1.41T  -
data/stile@DAILY_2013-11-15_15:00:01                   1.74M      -  1.41T  -
data/stile@DAILY_2013-11-15_17:00:02                   37.9M      -  1.41T  -
**here was my backup10 that I destroyed**
data/stile@DAILY_2013-11-15_19:00:01                    510M      -  1.41T  -
data/stile@SATURDAY_2013-11-16_20:00:03                    0      -  1.41T  -
data/stile@DAILY_2013-11-18_09:00:01                       0      -  1.41T  -
data/stile@DAILY_2013-11-18_11:00:01                    180M      -  1.41T  -
data/stile@DAILY_2013-11-18_13:00:02                   19.4M      -  1.41T  -
data/stile@DAILY_2013-11-18_15:00:01                   11.6M      -  1.41T  -
data/stile@DAILY_2013-11-18_17:00:01                   3.64M      -  1.41T  -
data/stile@DAILY_2013-11-18_19:00:00                     42K      -  1.41T  -
data/stile@DAILY_2013-11-19_09:00:02                     42K      -  1.41T  -
data/stile@DAILY_2013-11-19_11:00:01                   41.1M      -  1.41T  -
data/stile@DAILY_2013-11-19_13:00:01                   31.1M      -  1.41T  -
data/stile@DAILY_2013-11-19_15:00:01                   5.66M      -  1.41T  -
data/stile@DAILY_2013-11-19_17:00:01                   35.2M      -  1.41T  -
data/stile@backup11                                    36.7M      -  1.41T  -
data/stile@DAILY_2013-11-19_19:00:01                    136K      -  1.41T  -
data/stile@DAILY_2013-11-20_09:00:01                    119K      -  1.41T  -
data/stile@DAILY_2013-11-20_13:00:02                   16.0M      -  1.41T  -
data/stile@DAILY_2013-11-20_15:00:01                    134M      -  1.41T  -
data/stile@DAILY_2013-11-20_17:00:02                   7.15M      -  1.41T  -
data/stile@DAILY_2013-11-20_19:00:01                    113K      -  1.41T  -
As you can see, in the healthy one the data changes with every snapshot.
In the failing one, I can tell you that the data is back to the state of the
last snapshot that does not show "0".
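A rough way to find that point for every filesystem at once is sketched
below. last_changed is a made-up helper, it assumes a POSIX awk (nawk on
illumos), and it expects name and used separated by a tab, oldest snapshot
first. A USED of 0 on a snapshot can also just mean a quiet period (stile
shows it over the weekend), so a long run of zeros is a hint, not proof.

```shell
#!/bin/sh
# Hypothetical helper: reads "name<TAB>used" lines (oldest snapshot first) and,
# for each dataset, prints the newest snapshot with non-zero USED and how many
# zero-USED snapshots follow it.
last_changed() {
    awk -F '\t' '
        {
            split($1, p, "@"); ds = p[1]
            if (!(ds in seen)) { seen[ds] = 1; order[++n] = ds }
            if ($2 == "0" || $2 == "0B") { zeros[ds]++ }
            else { last[ds] = p[2]; zeros[ds] = 0 }
        }
        END {
            for (i = 1; i <= n; i++) {
                ds = order[i]
                printf "%s: last non-zero snapshot %s, %d zero-size snapshots since\n",
                    ds, ((ds in last) ? last[ds] : "(none)"), zeros[ds]
            }
        }
    '
}

# Possible use on the storage box (not executed here):
#   zfs list -H -r -t snapshot -o name,used -s creation data | last_changed
```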
People need an explanation, and I can't give one...
What's more, how do I get it back to normal? I don't even have new valid
snapshots...
Help!
Gabriele.