Can the general structure of a mortgage-backed security (MBS) contract be programmatically represented through the use of decentralized autonomous organizations (DAOs)? Such an approach could allow for the portfolio of loans to be managed by investors in a trustless and transparent way. This paper explores the potential for applying the tools of modern fintech, such as asset tokenization, smart contracts, and DAOs, to reconstruct traditional structured products with a greater degree of transparency and traceability. MBS investors face considerable value uncertainty as time increases between the actual occurrence (or non-occurrence) of cash flows and subsequent reporting. Given that an MBS is a financial contract, it should be expressible logically using the Algorithmic Contract Types Unified Standards (ACTUS). Since each underlying mortgage in an MBS derives its cash flows in a prescribed way over the life of the contract, implementation on a public blockchain could enable real-time ratings systems, improving market efficiency. We explore the potential for creating formal algorithmic designs of MBS-DAOs that incorporate individual mortgages, the underlying real estate assets (collateral), and any loan guarantees.
We develop an additive Cox proportional hazard model with time-varying covariates, including spatio-temporal characteristics of weather events, to study the impact of weather extremes (heavy rains and tropical cyclones) on the probability of mortgage default and prepayment. We compare the survival model with a flexible logistic model and an extreme gradient boosting algorithm. We estimate the models on a portfolio of mortgages in Florida, consisting of 69,046 loans and 3,707,831 loan-month observations with localization data at the five-digit ZIP code level. We find a statistically significant and non-linear impact of tropical cyclone intensity on default as well as a significant impact of heavy rains in areas with large exposure to flood risks. These findings confirm existing results in the literature and also provide estimates of the impact of the extreme event characteristics on mortgage risk, e.g., the impact of tropical cyclones on default more than doubles in magnitude when moving from a hurricane of category two to a hurricane of category three or more. We build on the identified effect of exposure to flood risk (in interaction with heavy rainfall) on mortgage default to perform a scenario analysis of the future impacts of climate change using the First Street flood model, which provides projections of exposure to floods in 2050 under RCP 4.5. We find a systematic increase in risk under climate change that can vary based on the scenario of extreme events considered. Climate-adjusted credit risk allows risk managers to better evaluate the impact of climate-related risks on mortgage portfolios.
The determinants of mortgage default have been an area of rising interest since the 2008 recession. There are two distinguishing features of mortgage default analysis. First, predictor variables are often only recorded at origination. However, many important variables such as credit scores vary over time. Second, there are omitted variables (such as borrower’s income and job security). If omitted variables are correlated with included regressors or if only origination values are used in a dynamic model, then biases may be present in econometric models for default risk. Our focus is to develop a ridge regression model to impute the dynamics of time-varying predictors and to capture unobserved borrower heterogeneity. The model is evaluated using cross-validation, and the relevant parameters are tuned to maximize out-of-sample predictive performance. After allowing for imputed dynamics and borrower heterogeneity, we find that the loan-to-value ratio becomes a larger signal of default risk and that credit scores as well as full documentation become smaller signals of default risk. These changes primarily are driven by imputing static variables, rather than dynamics, and may pertain to either omitted liquidity factors or strategic factors.
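As a rough illustration of the tuning step described above, the following sketch fits a closed-form ridge regression and picks the penalty by k-fold cross-validation. It is not the paper's model: the data are simulated, the outcome is continuous rather than a default indicator, and the names `ridge_fit`, `cv_mse`, and the grid `lambdas` are hypothetical choices made for this example.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam * I) beta = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def cv_mse(X, y, lam, k=5):
    """Average out-of-sample squared error over k contiguous folds."""
    folds = np.array_split(np.arange(len(y)), k)
    errors = []
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False          # hold out the current fold
        beta = ridge_fit(X[train], y[train], lam)
        resid = y[test_idx] - X[test_idx] @ beta
        errors.append(np.mean(resid ** 2))
    return float(np.mean(errors))

# Simulated data (an assumption for illustration): 20 predictors, 3 relevant.
rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [1.0, -0.5, 0.25]
y = X @ beta_true + rng.normal(size=n)

# Choose the penalty that minimizes cross-validated error.
lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]
scores = {lam: cv_mse(X, y, lam) for lam in lambdas}
best_lam = min(scores, key=scores.get)
print("selected penalty:", best_lam)
```

Larger penalties shrink the coefficient vector toward zero, which is what guards against overfitting when many imputed or weakly identified predictors enter the model.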
The strategy of geographically diversifying a portfolio of commercial real estate assets is an intuitive approach for risk management. However, due to high concentrations of these assets in major metropolitan areas, investors may face additional constraints in the portfolio optimization process. The rank-size rule, a log-linear relationship between city rank and size, provides one of the greatest empirical regularities in regional science. As such, it serves as a possible theoretical guide to the weights given to properties by location in a commercial real estate portfolio. This paper sets forth some ideas relating to the concentration side of portfolio variance and the limiting effect that large concentrations may have on the ability to diversify risk. Two variants of the rank-size relationship – the Zipf distribution and the parabolic fractal distribution – are fitted to a variety of datasets to provide a sense of the degree of concentration in the commercial real estate industry. These empirical findings suggest the presence of limitations to geographical diversification that have varying degrees of severity across different property types or sectors of the commercial real estate market.
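The two rank-size variants mentioned above can be sketched as least-squares fits in log-log space: the Zipf form regresses log size on log rank, and the parabolic fractal form adds a squared log-rank term. This is a minimal sketch on made-up data, not the paper's estimation procedure; `fit_rank_size` is a hypothetical helper name.

```python
import numpy as np

def fit_rank_size(sizes):
    """Fit two rank-size relationships by ordinary least squares.

    Zipf:              log(size) = a + b * log(rank)
    Parabolic fractal: log(size) = a + b * log(rank) + c * log(rank)**2
    """
    s = np.sort(np.asarray(sizes, dtype=float))[::-1]   # largest first
    log_rank = np.log(np.arange(1, len(s) + 1))
    log_size = np.log(s)
    zipf = np.polyfit(log_rank, log_size, 1)   # [slope, intercept]
    pfd = np.polyfit(log_rank, log_size, 2)    # [curvature, slope, intercept]
    return zipf, pfd

# Hypothetical data that follow an exact Zipf law: size = 1000 / rank.
sizes = 1000.0 / np.arange(1, 51)
zipf, pfd = fit_rank_size(sizes)
print("Zipf slope:", zipf[0])
print("Parabolic fractal curvature:", pfd[0])
```

A slope further below zero indicates heavier concentration in the top-ranked locations, and a nonzero curvature term signals that the straight log-log line misfits the data at one or both ends of the ranking.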
The National Flood Insurance Program (NFIP) was created in 1968 and allows homeowners, renters, and businesses to purchase flood insurance from the federal government. During the summer of 2019, without compromising privacy, the Federal Emergency Management Agency (FEMA) released a dataset containing 50 million observations. Researchers can now download and evaluate the 49,514,688 flood policy observations (beginning in 2009) and the 2,418,007 flood claims observations (beginning in 1970) in an easily accessible machine-readable format, bypassing the complex request procedures of the past. What exactly is included in this policy and claims data and how might it be used to examine flood insurance related topics? We provide real estate academics and industry professionals with the details of the 44 usable policy data variables and the 37 usable claims data variables, which we group into seven categories: Locational, Structural, Occupancy, Policy Terms, Zone/Elevation/Rating, Premiums, and Claims. In an effort to aid researchers with the initial complexities of working with the data, we provide sample R-code that can be used and altered to analyze NFIP data. Finally, for illustration, we demonstrate how the NFIP data can be merged with data from both the American Community Survey and Zillow to study the determinants of flood insurance take-up.
Although some have proposed eliminating the National Flood Insurance Program (NFIP) to reduce government expenditures, other alternatives exist that could reduce the cost of the program and increase its viability, such as increasing deductibles, which may increase participation and increase revenue. The recently released FIMA NFIP Redacted Policies Data Set provides unprecedented opportunities to examine homeowner deductible choices for flood insurance policies using policy-level data. The menu of deductibles currently ranges from $1,000 to $10,000 in Special Flood Hazard Areas (SFHAs), but until April 1, 2015 the maximum deductible was $5,000. Using a matched sample of 252,280 SFHA policies that were active for the surrounding 2013–2019 time period, we provide insight regarding characteristics of homeowners who chose the maximum deductible as well as those who switched from the $5,000 to the new $10,000 deductible. Consistent with nudge theory and stickiness, we show that the majority of the homeowners accept the default deductible option. Individuals in high-income and high-premium areas were more likely to select the maximum deductible. The level of education and past flood events do not impact the decision to select the maximum deductible option.
Portfolios of mortgage loans played an important role in the Great Recession and continue to compose a material part of bank assets. This chapter investigates how cross-sectional dependence in the underlying properties flows through to the loan returns, and thus, the risk of the portfolio. At one extreme, a portfolio of foreclosed mortgage loans becomes a portfolio of real estate whose returns exhibit substantial cross-sectional and spatial dependence. Near the other extreme, almost all loans perform and yield constant returns, which do not correlate with other performing loan returns. This suggests that loan performance effectively censors the random returns of the underlying properties. Following the statistical properties of the correlations among censored variables, the authors build off this foundation and show how the loan return correlations will rise as economic conditions deteriorate and the defaulting loans reveal the underlying housing correlations. In this chapter, the authors (1) adapt tools from spatial statistics to document substantial cross-sectional dependence across house price returns and examine the spatial structure of this dependence, (2) investigate the nonlinear nature of correlations among loan returns as a function of the default rate and the underlying house price correlations, and (3) conduct a simulation exercise using parameters from the empirical data to show the implications for holding a portfolio of mortgages.
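The censoring argument can be illustrated with a small simulation, assuming bivariate normal house price returns and a stylized loan return equal to the house return capped at a default threshold. These distributional choices, the function name `loan_return_corr`, and the parameter values are assumptions for this sketch rather than the chapter's calibration.

```python
import numpy as np
from statistics import NormalDist

def loan_return_corr(rho, default_rate, n=200_000, seed=0):
    """Correlation between two stylized loan returns.

    House returns are standard bivariate normal with correlation rho.
    A loan defaults when its house return falls below the threshold
    implied by default_rate; performing loans pay a constant return.
    """
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    house = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    threshold = NormalDist().inv_cdf(default_rate)
    loans = np.minimum(house, threshold)   # censor returns from above
    return float(np.corrcoef(loans[:, 0], loans[:, 1])[0, 1])

rho = 0.5
low = loan_return_corr(rho, default_rate=0.02)    # benign conditions
high = loan_return_corr(rho, default_rate=0.30)   # stressed conditions
print("loan correlation at 2% defaults:", low)
print("loan correlation at 30% defaults:", high)
```

Under these assumptions the loan return correlation stays below the underlying house price correlation and moves toward it as the default rate rises, which is the mechanism by which deteriorating conditions erode diversification.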
Dombrowski, T.P., “From Plantations to Blockchains: A Review and Synthesis of the MBS and DeFi Literatures,” Working Paper.
The tools of modern financial technology have enabled the tokenization of real-world assets and the enforcement of smart-contract-based governance for businesses. One area of traditional financial markets that is still in the infancy of development within existing blockchain ecosystems is that of mortgage-backed securities (MBSs). This paper aims to review the history of MBSs and summarize the newer literature regarding decentralized finance (DeFi). Among the more specific topics covered are real estate tokenization, smart contracts, and decentralized autonomous organizations (DAOs).
Dombrowski, T.P. and Seagraves, C., “Optimizing Real Estate Portfolios: The Role of AI in Geographic Diversification,” Working Paper.
Geographic diversification of a real estate portfolio is a common strategy for managing risk. Can ChatGPT be an effective tool for analyzing data and generating a diversified portfolio? We aim to answer this question by conducting an experiment to evaluate the predictive capabilities of ChatGPT for selecting cities in a residential real estate portfolio. This experiment involves building a dataset that includes city-level housing data from Zillow and Google Trends data for real estate interest at the city level. That data is then provided to ChatGPT, which is prompted to analyze the data and select cities for an investment portfolio. Those selections are then used to create several portfolios, which then get backtested and compared against some benchmark portfolios.
Dombrowski, T.P., Narayanan, R.P., and Pace, R.K., “Concentration Risk in Mortgage Portfolios: A Rank-Size Approach,” Working Paper.
Geographical diversification is an intuitive approach to manage risk in a mortgage portfolio. This paper examines whether high levels of geographical concentration in mortgage debt hinder the ability to diversify risk. To accomplish this, I apply the empirical regularity in regional science known as the rank-size rule, which is a log-linear relationship between city size and rank. With this, I estimate the degree of concentration across various sectors of the mortgage market. These results are improved upon by expanding to a non-linear model, which addresses an important concern regarding the linear fit. These concentration estimates suggest that there may be limits to diversification in heavily concentrated markets, such as the jumbo mortgage market. These jumbo loans tend to be held by systemically important financial institutions due to regulatory constraints preventing their purchase and securitization by the government sponsored enterprises. The insights from this approach provide a simplification for the varying sensitivities of different sectors in the mortgage market to local economic shocks.
Portfolios of mortgage loans played an important role in the Great Recession and continue to compose a material part of bank assets. The distribution of mortgage portfolio returns, and consequently, the risk of these portfolios, is quite distinct even from other fixed income asset classes. This dissertation contains three essays, each aiming to analyze a specific component of risk in mortgage portfolios and the role of geographical diversification in reducing this risk. The first essay investigates how cross-sectional dependence in the underlying properties flows through to the loan returns, and thus, the risk of the portfolio. In addition to demonstrating this relationship theoretically, this essay demonstrates how the spatially dependent structure of the underlying housing returns is revealed in the mortgage market by a shock to the default rate. The resulting increase in the asset correlations reduces the effectiveness of any geographical diversification present in the portfolio. Even when the distribution of mortgage returns is known, the ability to reduce portfolio risk through geographical diversification can be limited due to the concentration of mortgage debt in major metropolitan areas. The second essay aims to model this geographical concentration for various partitions of the mortgage market and examine the role this has on limiting investors’ ability to diversify risk. This is accomplished by fitting the empirical regularity from regional science known as the rank-size rule to measure this concentration. The third and final essay in this dissertation focuses on modeling the mortgage default decision and imputing unobserved factors that may bias the estimated impact of observed factors such as the loan-to-value ratio. As alluded to in the first essay, the default rate, or probability of default ex-ante, is an important determinant of the observed correlation across mortgage returns. This essay develops a ridge regression model, which is tuned to maximize out-of-sample predictive performance using cross-validation, that imputes these unobserved factors while preventing model overfitting.