Teradata TD Varchar會比char省空間嗎?

CHAR(n)固定是用掉n Bytes(LATIN; UNICODE則是2倍)

VARCHAR每個字元用掉n Bytes,另外再加2Bytes放長度資訊。

所以,資料是NULL,在

CHAR(10) => 10 BYTES

VARCHAR(10) => 0+2 BYTES

選用規則:視資料內容變動性來決定,如果資料平均變動(少於 N的長度)大於3,則選用VARCHAR在空間上就有節省的空間。

例如:要用CHAR(10) 或是VARCHAR(10)?

看資料平均長度是小於等於7,選VARCHAR(10)就可以省空間。

若有一定的資料是NULL,也可以考慮用CHAR(n) COMPRESS,把NULL壓起來。

VARCHAR(1)、VARCHAR(2)在空間上絕對是不合適的定義。


https://downloads.teradata.com/forum/database/recommendation-on-varhar-char-columns

 CHAR(n) always need exactly n bytes for LATIN or n*2 bytes for UNICODE while VARCHARs need 1 or two bytes for each character stored. But VARCHARs also need two byte additional overhead (a word indicating the length).

Assuming your columns are mainly Y/N or similar in LATIN a single character needs 3 bytes as a VARCHAR, but only 1 byte as CHAR. You never ever do a VARCHAR(1) on Teradata :-) 

When the actual length of a string greatly varies and the average is at least 3 characters than the maximum you start using VARCHARS.

Q2: FastLoad calculates the number of records per block based on the maximum defined length.

If you load VARTEXT there should be no difference, but if you load DATA each VARCHAR will add 2 byes more than CHAR, so in your case saving 400 bytes per row will result in a higer number of rows per block. To check look for Number of recs/msg  in the Insert Phase.

留言