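The Hessian matrix (German: Hesse-Matrix; English: Hessian matrix or Hessian; Chinese: 海森矩陣, also rendered 黑塞矩阵, 海塞(赛)矩陣 or 海瑟矩陣) is a square matrix made up of all second-order partial derivatives of a real-valued function of several variables. It was introduced by the German mathematician Otto Hesse and is named after him.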
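Suppose $f(x_1, x_2, \dots, x_n)$ is a real-valued function. If all second-order partial derivatives of $f$ exist and are continuous on its domain, then the Hessian matrix of $f$ is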
![{\displaystyle \mathbf {H} ={\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{n}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{n}}}\\\\\vdots &\vdots &\ddots &\vdots \\\\{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{n}^{2}}}\end{bmatrix}}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/23f4db415be866163432946603c07edbc4a21a41)
or, in index notation,
![{\displaystyle \mathbf {H} _{ij}={\frac {\partial ^{2}f}{\partial x_{i}\partial x_{j}}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/8aa2fc260814bfe867df5ee7b9ba4d663771ebae)
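Clearly the Hessian matrix $\mathbf{H}$ is an $n \times n$ square matrix. Its determinant is called the Hessian determinant. Note that in English the word "Hessian" may refer either to the matrix above or to its determinant[1].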
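As an illustration of the definition, the following minimal sketch approximates the entries $\mathbf{H}_{ij}$ with central finite differences. The helper name `hessian`, the step size `h`, and the sample function $f(x_1, x_2) = x_1^2 x_2 + x_2^3$ are assumptions chosen for this example and do not come from the article.

```python
def hessian(f, x, h=1e-3):
    """Approximate the Hessian of f at point x (a list of floats)
    using central finite differences with step h."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate f with coordinate i moved by di and coordinate j by dj.
                y = list(x)
                y[i] += di
                y[j] += dj
                return f(y)
            H[i][j] = (shifted(h, h) - shifted(h, -h)
                       - shifted(-h, h) + shifted(-h, -h)) / (4 * h * h)
    return H


def f(x):
    # Sample function (an assumption for this sketch): f = x1^2 * x2 + x2^3
    return x[0] ** 2 * x[1] + x[1] ** 3


# Analytically, H = [[2*x2, 2*x1], [2*x1, 6*x2]], i.e. [[4, 2], [2, 12]] at (1, 2).
H = hessian(f, [1.0, 2.0])
for row in H:
    print([round(v, 6) for v in row])
```

Finite differences are only one way to obtain the entries; symbolic or automatic differentiation would avoid the truncation and rounding error that depends on `h`.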
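From calculus, if a function of one variable $f(x)$ has derivatives of every order in some neighborhood of the point $x = x_0$, then the Taylor expansion of $f(x)$ at $x = x_0$ is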
![{\displaystyle f(x)=f(x_{0})+f'(x_{0})\Delta x+{\frac {f''(x_{0})}{2!}}\Delta x^{2}+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c7099923f582ffe5517e204a02780e3981e6c101)
where $\Delta x = x - x_0$ and $\Delta x^2 = (x - x_0)^2$.
Similarly, the Taylor expansion of a function of two variables $f(x_1, x_2)$ at the point $x_0(x_{10}, x_{20})$ is
![{\displaystyle f(x_{1},x_{2})=f(x_{10},x_{20})+f_{x_{1}}(x_{10},x_{20})\Delta x_{1}+f_{x_{2}}(x_{10},x_{20})\Delta x_{2}+{\frac {1}{2}}[f_{x_{1}x_{1}}(x_{10},x_{20})\Delta x_{1}^{2}+2f_{x_{1}x_{2}}(x_{10},x_{20})\Delta x_{1}\Delta x_{2}+f_{x_{2}x_{2}}(x_{10},x_{20})\Delta x_{2}^{2}]+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/6f716863c511e8df89e2239a014ac68cb4552072)
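where $\Delta x_1 = x_1 - x_{10}$, $\Delta x_2 = x_2 - x_{20}$, $f_{x_1} = \frac{\partial f}{\partial x_1}$, $f_{x_2} = \frac{\partial f}{\partial x_2}$, $f_{x_1 x_1} = \frac{\partial^2 f}{\partial x_1^2}$, $f_{x_2 x_2} = \frac{\partial^2 f}{\partial x_2^2}$ and $f_{x_1 x_2} = \frac{\partial^2 f}{\partial x_1 \partial x_2}$.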
Written in matrix form, this expansion becomes
![{\displaystyle f(x)=f(x_{0})+\nabla f(x_{0})^{\mathrm {T} }\Delta x+{\frac {1}{2}}\Delta x^{\mathrm {T} }G(x_{0})\Delta x+\cdots }](https://wikimedia.org/api/rest_v1/media/math/render/svg/410d9cadefc4015ace1832de2c31dc8163eda8f0)
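where $\Delta x = \begin{bmatrix}\Delta x_1 \\ \Delta x_2\end{bmatrix}$, $\Delta x^{\mathrm{T}} = \begin{bmatrix}\Delta x_1 & \Delta x_2\end{bmatrix}$ is the transpose of $\Delta x$, $\nabla f(x_0) = \begin{bmatrix}\frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2}\end{bmatrix}_{x_0}^{\mathrm{T}}$ is the gradient of $f(x_1, x_2)$ at $x_0(x_{10}, x_{20})$, and the matrix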
![{\displaystyle G(x_{0})={\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}\end{bmatrix}}_{x_{0}}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/a05ada63da2ce012b28f4f9a58499b96f3954ffa)
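is the $2 \times 2$ Hessian matrix of $f(x_1, x_2)$ at $x_0(x_{10}, x_{20})$: the square matrix of all second-order partial derivatives of $f(x_1, x_2)$ evaluated at $x_0(x_{10}, x_{20})$.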
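Because the second-order partial derivatives of the function are continuous, the mixed partial derivatives are equal: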
![{\displaystyle {\frac {\partial ^{2}f}{\partial x_{1}\partial x_{2}}}={\frac {\partial ^{2}f}{\partial x_{2}\partial x_{1}}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/33744c4388fb98b61ce851e5c5e7ecf2c9a59d29)
Hence the Hessian matrix $G(x_0)$ is a symmetric matrix.
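The matrix form of the two-variable expansion can be checked numerically. The sketch below uses the sample function $f(x_1, x_2) = e^{x_1} \sin x_2$ expanded at $x_0 = (0, 0)$; the function, the hand-computed gradient and Hessian, and the name `taylor2` are all assumptions made for this illustration.

```python
import math


def f(x1, x2):
    # Sample function (an assumption for this sketch).
    return math.exp(x1) * math.sin(x2)


# Expansion point x0 = (0, 0); the values below were derived by hand.
f0 = 0.0                      # f(0, 0)
grad = [0.0, 1.0]             # [f_x1, f_x2] at x0
G = [[0.0, 1.0], [1.0, 0.0]]  # Hessian at x0 (symmetric)


def taylor2(dx1, dx2):
    """Second-order Taylor approximation f0 + grad^T dx + 0.5 * dx^T G dx."""
    dx = [dx1, dx2]
    linear = sum(grad[i] * dx[i] for i in range(2))
    quadratic = sum(dx[i] * G[i][j] * dx[j] for i in range(2) for j in range(2))
    return f0 + linear + 0.5 * quadratic


dx1, dx2 = 0.1, 0.05
exact = f(dx1, dx2)
approx = taylor2(dx1, dx2)
error = abs(exact - approx)
print(exact, approx, error)
```

For small displacements the error should shrink roughly like the cube of the step, since the first neglected terms are of third order.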
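Extending the two-variable Taylor expansion to several variables, the Taylor expansion of $f(x_1, x_2, \dots, x_n)$ at the point $x_0$ is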
![{\displaystyle f(x)=f(x_{0})+\nabla f(x_{0})^{\mathrm {T} }\Delta x+{\frac {1}{2}}\Delta x^{\mathrm {T} }G(x_{0})\Delta x+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c1190639eb839919d8982edaf464e8360e510102)
where
![{\displaystyle \nabla f(x_{0})={\begin{bmatrix}{\frac {\partial f}{\partial x_{1}}}&{\frac {\partial f}{\partial x_{2}}}&\cdots &{\frac {\partial f}{\partial x_{n}}}\end{bmatrix}}_{x_{0}}^{T}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/7bd56e9fd6534d3dd4209cc775f772751397c224)
is the gradient of the function
![{\displaystyle f(x)}](https://wikimedia.org/api/rest_v1/media/math/render/svg/202945cce41ecebb6f643f31d119c514bec7a074)
at the point
![{\displaystyle x_{0}(x_{1},x_{2},\cdots ,x_{n})\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/5344a0aafeed11db06bc4b3474b00853ae2a7fa5)
and
![{\displaystyle G(x_{0})={\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{n}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{n}}}\\\\\vdots &\vdots &\ddots &\vdots \\\\{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{n}^{2}}}\end{bmatrix}}_{x_{0}}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/05c00292d8658bcad4960686b04a604d7323d663)
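is the $n \times n$ Hessian matrix of $f(x_1, x_2, \dots, x_n)$ at $x_0$. If the second-order partial derivatives of the function are continuous, its $n \times n$ Hessian matrix is symmetric.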
Note: in the field of optimization design, the Hessian matrix is often denoted $G$, and the gradient is sometimes denoted $g$.[2]
The Hessian matrix of a function $f$ is related to the Jacobian matrix as follows:
![{\displaystyle \mathrm {H} (f)=\mathrm {J} (\nabla f)^{T}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/2b23bfb757383e1bcb2a908d970a947358b0e769)
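That is, the Hessian matrix of $f$ is the transpose of the Jacobian matrix of its gradient. When the Hessian is symmetric, it simply equals the Jacobian matrix of the gradient.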
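A short worked example may make this relation concrete. The function $f(x_1, x_2) = x_1^2 x_2 + x_2^3$ is an assumption chosen for illustration. Differentiating each component of the gradient with respect to each variable reproduces the matrix of second-order partial derivatives:

```latex
\begin{aligned}
f(x_1, x_2) &= x_1^2 x_2 + x_2^3, \\
\nabla f &= \begin{bmatrix} 2 x_1 x_2 \\ x_1^2 + 3 x_2^2 \end{bmatrix}, \\
\mathrm{J}(\nabla f) &=
\begin{bmatrix}
\dfrac{\partial (2 x_1 x_2)}{\partial x_1} & \dfrac{\partial (2 x_1 x_2)}{\partial x_2} \\[2ex]
\dfrac{\partial (x_1^2 + 3 x_2^2)}{\partial x_1} & \dfrac{\partial (x_1^2 + 3 x_2^2)}{\partial x_2}
\end{bmatrix}
=
\begin{bmatrix}
2 x_2 & 2 x_1 \\
2 x_1 & 6 x_2
\end{bmatrix}
= \mathrm{H}(f).
\end{aligned}
```

Because this matrix is symmetric, taking the transpose leaves it unchanged.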
Extremum conditions for functions
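For a function of one variable $f(x)$ that is differentiable at a point $x = x_0$ in a given interval and attains an extremum at $x = x_0$, a necessary condition is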
![{\displaystyle f'(x_{0})=0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/ccb00c237e61f1121e81c87826a52a394a31b720)
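That is, an extremum of $f(x)$ must occur at a stationary point, or in other words every extremum of a differentiable function $f(x)$ is a stationary point. The converse fails: a stationary point need not be an extremum. Whether a stationary point is an extremum can be tested with the sign of the second derivative. From the Taylor expansion of $f(x)$ at $x_0$, together with the necessary condition above,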
![{\displaystyle f(x)=f(x_{0})+{\frac {f''(x_{0})}{2!}}\Delta x^{2}+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/307ffc2a2d5a9926c3590d54f540c9f8ef03b1da)
If $f(x)$ is to attain a local minimum at $x_0$, every point $x \neq x_0$ in some neighborhood of $x_0$ must satisfy
![{\displaystyle f(x)-f(x_{0})>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/bc124cae2932535e1ade0cef2fd43eb5b4844062)
which requires
![{\displaystyle {\frac {f''(x_{0})}{2!}}\Delta x^{2}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/4ad8dc656e4b7f04ddf2cd716638a480f8d23c03)
or equivalently
![{\displaystyle f''(x_{0})>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/12f46feabcab044cd1440902b7b50a7bf0f2a878)
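The case of a local maximum at $x_0$ is analogous. This gives a sufficient condition for an extremum:
Suppose the function of one variable $f(x)$ has a second derivative at $x_0$, with $f'(x_0) = 0$ and $f''(x_0) \neq 0$. Then
- if $f''(x_0) > 0$, $f(x)$ attains a local minimum at $x_0$;
- if $f''(x_0) < 0$, $f(x)$ attains a local maximum at $x_0$.
When $f''(x_0) = 0$, the test is inconclusive and the signs of successively higher derivatives must be examined. The rule is: if the first nonvanishing derivative is of even order, the stationary point is an extremum; if it is of odd order, the point is an inflection point rather than an extremum.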
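The higher-order derivative rule for one variable can be sketched as a small function. The name `classify_stationary_point` and the convention of passing the derivative values as a list are assumptions for this illustration. The caller is expected to supply the derivatives of $f$ at $x_0$, computed by hand or by other means.

```python
def classify_stationary_point(derivatives, tol=1e-12):
    """Classify x0 from the values of f'(x0), f''(x0), f'''(x0), ...

    derivatives[0] is f'(x0), derivatives[1] is f''(x0), and so on.
    Values with magnitude below tol are treated as zero.
    """
    if abs(derivatives[0]) > tol:
        return "not a stationary point"
    # Look for the first nonvanishing derivative of order k >= 2.
    for k, d in enumerate(derivatives[1:], start=2):
        if abs(d) > tol:
            if k % 2 == 1:
                return "inflection point"
            return "local minimum" if d > 0 else "local maximum"
    return "inconclusive"


# f(x) = x^4 at x0 = 0: derivatives are 0, 0, 0, 24
print(classify_stationary_point([0, 0, 0, 24]))  # local minimum
# f(x) = x^3 at x0 = 0: derivatives are 0, 0, 6
print(classify_stationary_point([0, 0, 6]))      # inflection point
# f(x) = -x^2 at x0 = 0: derivatives are 0, -2
print(classify_stationary_point([0, -2]))        # local maximum
```

If every supplied derivative vanishes, the sketch reports "inconclusive" rather than guessing, because more derivatives would be needed.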
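For a function of two variables $f(x_1, x_2)$ that is differentiable at a point $x_0(x_{10}, x_{20})$ in a given region and attains an extremum at $x_0$, a necessary condition is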
![{\displaystyle f_{x_{1}}(x_{0})=f_{x_{2}}(x_{0})=0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/a0e415d3dd559fcbce0f4360324230e051e145b2)
that is,
![{\displaystyle \nabla f(x_{0})=0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/de546d17e32536211c8d6ecf39eb17d52acd1c3a)
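Again this is only a necessary condition. To decide whether $x_0$ is an extremum, a sufficient condition is needed. From the Taylor expansion of $f(x_1, x_2)$ at $x_0$, together with the necessary condition above,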
![{\displaystyle f(x_{1},x_{2})=f(x_{10},x_{20})+{\frac {1}{2}}[f_{x_{1}x_{1}}(x_{0})\Delta x_{1}^{2}+2f_{x_{1}x_{2}}(x_{0})\Delta x_{1}\Delta x_{2}+f_{x_{2}x_{2}}(x_{0})\Delta x_{2}^{2}]+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/ee48435bd67521c76d889b01677bf7c454a822b1)
Let $A = f_{x_1 x_1}(x_0)$, $B = f_{x_1 x_2}(x_0)$ and $C = f_{x_2 x_2}(x_0)$; then
![{\displaystyle f(x_{1},x_{2})=f(x_{10},x_{20})+{\frac {1}{2}}[A\Delta x_{1}^{2}+2B\Delta x_{1}\Delta x_{2}+C\Delta x_{2}^{2}]+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/ae8df8836f38fe763c0bbb2c3d54910d98b40110)
or, completing the square when $A \neq 0$,
![{\displaystyle f(x_{1},x_{2})=f(x_{10},x_{20})+{\frac {1}{2A}}[(A\Delta x_{1}+B\Delta x_{2})^{2}+(AC-B^{2})\Delta x_{2}^{2}]+\cdots \,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/74791d552062316bab8b1607f69c415631ebb6df)
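The equivalence of the two second-order terms can be verified by expanding the completed square. This short derivation assumes $A \neq 0$ and uses only the symbols $A$, $B$, $C$ from the formulas above:

```latex
\begin{aligned}
\frac{1}{2A}\left[(A\Delta x_1 + B\Delta x_2)^2 + (AC - B^2)\Delta x_2^2\right]
&= \frac{1}{2A}\left[A^2\Delta x_1^2 + 2AB\,\Delta x_1\Delta x_2 + B^2\Delta x_2^2 + AC\,\Delta x_2^2 - B^2\Delta x_2^2\right] \\
&= \frac{1}{2A}\left[A^2\Delta x_1^2 + 2AB\,\Delta x_1\Delta x_2 + AC\,\Delta x_2^2\right] \\
&= \frac{1}{2}\left[A\Delta x_1^2 + 2B\,\Delta x_1\Delta x_2 + C\Delta x_2^2\right].
\end{aligned}
```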
If $f(x_1, x_2)$ is to attain a local minimum at $x_0$, every point $x \neq x_0$ in some neighborhood of $x_0$ must satisfy
![{\displaystyle f(x_{1},x_{2})-f(x_{10},x_{20})>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/fbf9eb2e0bccf2c322984a9cb0486f89efaf0056)
which requires
![{\displaystyle {\frac {1}{2A}}[(A\Delta x_{1}+B\Delta x_{2})^{2}+(AC-B^{2})\Delta x_{2}^{2}]>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/f005560ba6efd5567601712a8164c01a2c94bd19)
or equivalently $A > 0$ and $AC - B^2 > 0$, that is,
![{\displaystyle \left.{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}\right|_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c528aae5ae3061bed869a4add9932af46d6fc9cf)
![{\displaystyle {\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}-({\frac {\partial ^{2}f}{\partial x_{1}\partial x_{2}}})^{2}\end{bmatrix}}_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d9ddd41047c623b067cce9e3fea5d43faa9b6eaa)
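This condition says that all leading principal minors of the Hessian matrix $G(x_0)$ of $f(x_1, x_2)$ at $x_0$ are greater than zero. That is, for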
![{\displaystyle G(x_{0})={\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}\end{bmatrix}}_{x_{0}}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/a05ada63da2ce012b28f4f9a58499b96f3954ffa)
we require
![{\displaystyle \left.{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}\right|_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c528aae5ae3061bed869a4add9932af46d6fc9cf)
![{\displaystyle |G(x_{0})|={\begin{vmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}\end{vmatrix}}_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/3034ba0cfc2dd22fa6e6b8af7bdc0078c9c15fb6)
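The case of a local maximum at $x_0$ is analogous. This gives a sufficient condition for an extremum:
Suppose the function of two variables $f(x_1, x_2)$ is continuous and has continuous first- and second-order partial derivatives in a neighborhood of $x_0(x_{10}, x_{20})$, with $f_{x_1}(x_0) = f_{x_2}(x_0) = 0$, and let $A = f_{x_1 x_1}(x_0)$, $B = f_{x_1 x_2}(x_0)$, $C = f_{x_2 x_2}(x_0)$. Then
- if $A > 0$ and $AC - B^2 > 0$, $f(x_1, x_2)$ attains a local minimum at $x_0(x_{10}, x_{20})$;
- if $A < 0$ and $AC - B^2 > 0$, $f(x_1, x_2)$ attains a local maximum at $x_0(x_{10}, x_{20})$.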
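Moreover, when $AC - B^2 < 0$, $f(x_1, x_2)$ has no extremum at $x_0(x_{10}, x_{20})$; such a point is called a saddle point. When $AC - B^2 = 0$, the test is inconclusive. A supplementary rule applies in this case: if …, then $f(x_1, x_2)$ has an extremum at $x_0(x_{10}, x_{20})$, which is a minimum when … and a maximum when ….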
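The two-variable test can be summarized as a small decision function. This is a minimal sketch: the name `second_derivative_test` and the tolerance `tol` are assumptions for this illustration, and the degenerate case $AC - B^2 = 0$ is simply reported as inconclusive.

```python
def second_derivative_test(A, B, C, tol=1e-12):
    """Classify a stationary point of f(x1, x2) from
    A = f_x1x1(x0), B = f_x1x2(x0), C = f_x2x2(x0)."""
    D = A * C - B * B  # determinant of the 2x2 Hessian
    if D > tol:
        # D > 0 forces A != 0, so the sign of A decides the case.
        return "local minimum" if A > 0 else "local maximum"
    if D < -tol:
        return "saddle point"
    return "inconclusive"


# f = x1^2 + x2^2 at (0, 0): A = 2, B = 0, C = 2
print(second_derivative_test(2, 0, 2))    # local minimum
# f = -x1^2 - x2^2 at (0, 0): A = -2, B = 0, C = -2
print(second_derivative_test(-2, 0, -2))  # local maximum
# f = x1^2 - x2^2 at (0, 0): A = 2, B = 0, C = -2
print(second_derivative_test(2, 0, -2))   # saddle point
# f = x1^4 + x2^4 at (0, 0): A = B = C = 0
print(second_derivative_test(0, 0, 0))    # inconclusive
```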
From linear algebra, if the matrix $G(x_0)$ satisfies
![{\displaystyle \left.{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}\right|_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c528aae5ae3061bed869a4add9932af46d6fc9cf)
![{\displaystyle {\begin{vmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}\end{vmatrix}}_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d4e5188c5bf5eea088f17ad7e048a2202649506f)
then $G(x_0)$ is a positive-definite matrix.
If the matrix $G(x_0)$ satisfies
![{\displaystyle \left.{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}\right|_{x_{0}}<0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/410e4e3dc279b8ea0b5b9baef24343e337b0b32d)
![{\displaystyle {\begin{vmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}\end{vmatrix}}_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d4e5188c5bf5eea088f17ad7e048a2202649506f)
then $G(x_0)$ is a negative-definite matrix.[3]
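The extremum condition for a function of two variables $f(x_1, x_2)$ at $x_0$ can therefore be stated as follows: if the Hessian matrix of $f(x_1, x_2)$ at $x_0$ is positive definite, the function attains a local minimum there; if the Hessian matrix at $x_0$ is negative definite, it attains a local maximum.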
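A short worked example, using the sample function $f(x_1, x_2) = x_1^2 + x_1 x_2 + x_2^2$ (an assumption chosen for illustration), shows the condition in use:

```latex
\begin{aligned}
\nabla f &= \begin{bmatrix} 2 x_1 + x_2 \\ x_1 + 2 x_2 \end{bmatrix} = 0
\quad\Longrightarrow\quad x_0 = (0, 0), \\
G(x_0) &= \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \qquad
\left.\frac{\partial^2 f}{\partial x_1^2}\right|_{x_0} = 2 > 0, \qquad
|G(x_0)| = 2 \cdot 2 - 1 \cdot 1 = 3 > 0 .
\end{aligned}
```

Both conditions hold, so $G(x_0)$ is positive definite and $f$ attains a local minimum $f(0, 0) = 0$ at the origin. This agrees with the identity $f = (x_1 + \tfrac{1}{2} x_2)^2 + \tfrac{3}{4} x_2^2 \geq 0$.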
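For a function of several variables $f(x_1, x_2, \dots, x_n)$ that attains an extremum at the point $x_0$, the necessary condition for the extremum is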
![{\displaystyle \nabla f(x_{0})={\begin{bmatrix}{\frac {\partial f}{\partial x_{1}}}&{\frac {\partial f}{\partial x_{2}}}&\cdots &{\frac {\partial f}{\partial x_{n}}}\end{bmatrix}}_{x_{0}}^{T}=0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/cacaa85f910c43c7b3d7dfaebf8b05c2eb1fcf2b)
A sufficient condition for a local minimum is that
![{\displaystyle G(x_{0})={\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{n}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{n}}}\\\\\vdots &\vdots &\ddots &\vdots \\\\{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{n}^{2}}}\end{bmatrix}}_{x_{0}}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/05c00292d8658bcad4960686b04a604d7323d663)
is positive definite, which requires all leading principal minors of $G(x_0)$ to be greater than zero:
![{\displaystyle \left.{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}\right|_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/c528aae5ae3061bed869a4add9932af46d6fc9cf)
![{\displaystyle {\begin{vmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}\end{vmatrix}}_{x_{0}}>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d4e5188c5bf5eea088f17ad7e048a2202649506f)
![{\displaystyle \vdots }](https://wikimedia.org/api/rest_v1/media/math/render/svg/f8039d9feb6596ae092e5305108722975060c083)
![{\displaystyle |G(x_{0})|>0\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/243c218404919cdd0bc052fd06277593317e22bc)
A sufficient condition for a local maximum is that
![{\displaystyle G(x_{0})={\begin{bmatrix}{\frac {\partial ^{2}f}{\partial x_{1}^{2}}}&{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{1}\,\partial x_{n}}}\\\\{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{2}^{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{2}\,\partial x_{n}}}\\\\\vdots &\vdots &\ddots &\vdots \\\\{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{1}}}&{\frac {\partial ^{2}f}{\partial x_{n}\,\partial x_{2}}}&\cdots &{\frac {\partial ^{2}f}{\partial x_{n}^{2}}}\end{bmatrix}}_{x_{0}}\,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/05c00292d8658bcad4960686b04a604d7323d663)
is negative definite.[4][5][6]
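The leading-principal-minor test extends to $n \times n$ matrices and can be sketched in plain Python. The helper names `determinant`, `leading_principal_minors` and `definiteness` are assumptions for this illustration. The negative-definite branch uses the standard fact that a symmetric matrix is negative definite exactly when its leading principal minors alternate in sign, starting with a negative one.

```python
def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(M)
    A = [list(map(float, row)) for row in M]
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(A[r][c]))
        if A[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            det = -det  # a row swap flips the sign
        det *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return det


def leading_principal_minors(G):
    """Determinants of the top-left k x k blocks, k = 1..n."""
    return [determinant([row[:k] for row in G[:k]])
            for k in range(1, len(G) + 1)]


def definiteness(G):
    """Classify a symmetric matrix G by its leading principal minors.
    Returns "neither" for indefinite or merely semidefinite matrices."""
    minors = leading_principal_minors(G)
    if all(m > 0 for m in minors):
        return "positive definite"
    if all((m < 0 if k % 2 == 1 else m > 0)
           for k, m in enumerate(minors, start=1)):
        return "negative definite"
    return "neither"


G = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
neg_G = [[-v for v in row] for row in G]
print(definiteness(G))                 # positive definite
print(definiteness(neg_G))             # negative definite
print(definiteness([[1, 2], [2, 1]]))  # neither
```

For large or ill-conditioned matrices, checking eigenvalues or attempting a Cholesky factorization is usually more robust than computing minors; the minor-based version is shown here because it mirrors the condition stated in the text.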
Further reading
References