Abstract
Decentralized optimization has attracted much research interest for resource-limited networked multiagent systems in recent years. Decentralized consensus optimization, one of the decentralized optimization problems of great practical importance, minimizes an objective function that is the sum of terms from individual agents over a set of variables on which all the agents should reach a consensus. This problem can be reformulated into an equivalent model with two blocks of variables, which can then be solved by the alternating direction method (ADM) with only communications between neighboring nodes. Motivated by a recently emerged class of so-called multi-block ADMs, this article demonstrates that it is more natural to reformulate a decentralized consensus optimization problem as one with multiple blocks of variables and solve it by a multi-block ADM. In particular, we focus on the multi-block ADM with parallel splitting, which admits an easy decentralized implementation. The convergence rate is analyzed in the setting of average consensus, and the relation between the two-block and multi-block ADMs is studied. Numerical experiments demonstrate the effectiveness of the multi-block ADM with parallel splitting in terms of speed and communication cost and show that it has better network scalability.
Introduction
In recent years, the communication, signal processing, control, and optimization communities have witnessed considerable research efforts on decentralized optimization for networked multiagent systems [13]. A networked multiagent system, such as a wireless sensor network (WSN) or a networked control system (NCS), is composed of multiple geographically distributed but interconnected agents which have sensing, computation, communication, and actuating abilities. This system generally has limited resources for communication, since battery power is limited and recharging is difficult, while communication between two agents is energy-consuming. Furthermore, the communication link is often vulnerable and bandwidth-limited. In this situation, decentralized optimization emerges as an effective approach to improve network scalability. In decentralized optimization, data and computation are decentralized. Each agent exchanges information with its neighbors and accomplishes an otherwise centralized optimization task.
This article focuses on the decentralized consensus optimization problem. We consider a network of L agents that cooperatively minimize a separable objective function [3-8]:
min_{x} Σ_{i=1}^{L} f_{i}(x), (1)

where f_{i}: R^{N}→R is the convex local objective function known only to agent i.
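As a concrete instance of (1), the sketch below builds a toy decentralized least-squares objective and its centralized solution for reference. The dimensions and data here are illustrative choices, not taken from the article.

```python
import numpy as np

# Toy instance of problem (1): L agents, each privately holding a
# least-squares term f_i(x) = 0.5 * ||A_i x - b_i||^2.
# (L, M, N, A_i, b_i below are illustrative, not from the article.)
rng = np.random.default_rng(0)
L, M, N = 5, 10, 3
A = [rng.standard_normal((M, N)) for _ in range(L)]
b = [rng.standard_normal(M) for _ in range(L)]

# Centralized benchmark: minimize sum_i f_i(x) via the stacked normal
# equations (sum_i A_i^T A_i) x = sum_i A_i^T b_i. A decentralized
# algorithm should reach this x_star using only neighbor communication.
H = sum(Ai.T @ Ai for Ai in A)
g = sum(Ai.T @ bi for Ai, bi in zip(A, b))
x_star = np.linalg.solve(H, g)
```

Any decentralized scheme discussed later can be checked against this centralized `x_star`.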
Related study
The decentralized consensus optimization formulation (1) arises in many practical applications, such as averaging [9-11], estimation [12-17], learning [18-21], etc. The form of f_{i}(x) can be least squares [11-13], ℓ_{1}-regularized least squares [14-17], or more general ones [18-21]. Note that this model can be extended to account for those with separable constraints, such as the network utility maximization (NUM) problem [22-24].
Existing approaches to solving (1) include: i) belief propagation based on graphical models and Markovian random fields [18-20]; ii) incremental optimization, which minimizes the overall objective function along a predefined path on the network [7,8]; iii) stochastic optimization with information exchange between neighboring agents [4-6]; and iv) optimization with explicit consensus constraints, which can be handled with the alternating direction method (ADM) [3,12-17]. The ADM approach is fully decentralized, makes no assumptions on the network infrastructure (such as being loop-free or having a predefined path), and generally has satisfactory convergence performance. In this article, we mainly discuss the application of ADMs to the decentralized consensus optimization problem.
Our research is along the line of information-driven signal processing and control of WSNs and NCSs [24-26]. Accompanying the unprecedented data collection abilities offered by large-scale networked multiagent systems, a new challenge arises: how should we process such a large amount of data to make estimates and produce control strategies given limited network resources? Instead of processing the data in a fusion center, our solution is to let each agent autonomously make decisions aided by limited communication with its neighbors. From this perspective, each individual objective function f_{i}(x) in (1) is constructed from the data collected by agent i, and x is the global information common to all agents (e.g., estimates or control strategies) obtained based on the data collected by the whole network. Though this framework can be generalized to various signal processing and control problems, this article focuses on those that can be formulated as (1). For problems such as dynamic control and Kalman filtering of networked multiagent systems, interested readers are referred to [1,2] and [27,28], respectively.
Our contribution
Motivated by a series of recent articles on multi-block ADMs and their convergence analysis [29-31], this article describes their application to the decentralized consensus optimization problem. The multi-block ADM with parallel splitting is reviewed in Section 3. Unlike the classical ADM (see textbooks [32,33]), this multi-block ADM splits the optimization variables into multiple blocks and sequentially updates just one of them while fixing the others. The classical ADM, on the other hand, only has two blocks of variables; hence in this article we refer to it as the two-block ADM. Our problem (1) does not naturally have two distinct blocks of variables, and to apply the two-block ADM one needs to introduce extra variables (see, e.g., [15,16,32]). We review this in Section 2. On the other hand, it is simpler to apply the multi-block ADM to (1), and the resulting algorithm is readily decentralized.
This article also analyzes the convergence rate of the multi-block ADM applied to the average consensus problem, which is a special case of (1) where f_{i}(x)=(1/2)‖x−y_{i}‖^{2}, with y_{i} being the measurement held by agent i; the optimal solution is then the average of all measurements.
Paper organization
The rest of this article is organized as follows. Section 2 reviews a reformulation of the decentralized consensus optimization problem (1) to which the two-block ADM is applied. Section 3 reviews the multi-block ADM and applies a parallel-splitting version of it to (1). Section 4 elaborates on the convergence rate analysis for the average consensus problem and shows that the two-block ADM is a special case of the multi-block ADM in this setting. Section 5 presents numerical simulations of the two-block and multi-block ADMs. Finally, Section 6 concludes the article. Proofs and derivations are collected in Appendix 1.
Problem formulation and the twoblock ADM
In this section, we describe an equivalent formulation of the decentralized consensus optimization problem (1) and outline the algorithm design based on the two-block ADM.
Problem formulation
We consider a networked multiagent system described by an undirected connected communication graph; two agents can exchange information if and only if they are neighbors in this graph. Our objective is to solve (1) with only information exchange between neighbors. To this end, define x^{(i)} as agent i's local copy of x and impose consensus constraints x^{(i)}=x^{(j)} for all pairs of neighbors i and j. With these, and given that the communication graph is connected, (1) is equivalently rewritten as the neighboring-consensus-constrained problem:

min_{{x^{(i)}}} Σ_{i=1}^{L} f_{i}(x^{(i)}), s.t. x^{(i)}=x^{(j)} for all neighboring pairs (i,j). (2)
The twoblock ADM
Let us consider the following convex program with two blocks of variables and a separable equality constraint:

min_{θ_{1},θ_{2}} g_{1}(θ_{1})+g_{2}(θ_{2}), s.t. D_{1}θ_{1}+D_{2}θ_{2}=e. (3)

Here, for i=1 and 2, g_{i} is a convex function, θ_{i} is a block of optimization variables, and D_{i} and e are a matrix and a vector of proper dimensions. At each time t, the two-block ADM minimizes the augmented Lagrangian of (3) with respect to θ_{1} and θ_{2} sequentially:

θ_{1}(t+1)=argmin_{θ_{1}} g_{1}(θ_{1})+λ(t)^{T}D_{1}θ_{1}+(c/2)‖D_{1}θ_{1}+D_{2}θ_{2}(t)−e‖^{2},

θ_{2}(t+1)=argmin_{θ_{2}} g_{2}(θ_{2})+λ(t)^{T}D_{2}θ_{2}+(c/2)‖D_{1}θ_{1}(t+1)+D_{2}θ_{2}−e‖^{2},

and updates the Lagrange multiplier λ(t + 1) as:

λ(t+1)=λ(t)+c(D_{1}θ_{1}(t+1)+D_{2}θ_{2}(t+1)−e),

where c is a positive constant.
The two-block ADM guarantees global convergence for any c > 0 [32]. More precisely, when g_{i} is convex for i = 1 and 2, the dual sequence {λ(t)} converges to an optimal dual solution of (3); if further the primal sequence {[θ_{1}(t)^{T},θ_{2}(t)^{T}]^{T}} is bounded, it converges to an optimal primal solution of (3).
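To make the iteration concrete, here is a minimal numeric sketch of the two-block ADM on the toy problem min ½(θ_{1}−a)^{2} + ½(θ_{2}−b)^{2} subject to θ_{1}−θ_{2}=0, whose solution is θ_{1}=θ_{2}=(a+b)/2; the toy objective and the choice of c are illustrative, not from the article.

```python
# Two-block ADM sketch on: min 0.5*(th1-a)^2 + 0.5*(th2-b)^2  s.t. th1 - th2 = 0.
# Augmented Lagrangian: g1 + g2 + lam*(th1 - th2) + (c/2)*(th1 - th2)^2.
# a, b, c below are illustrative choices.
a, b, c = 1.0, 3.0, 1.0
th1 = th2 = lam = 0.0
for _ in range(100):
    th1 = (a - lam + c * th2) / (1 + c)   # minimize over th1 with th2 fixed
    th2 = (b + lam + c * th1) / (1 + c)   # minimize over th2 with new th1 fixed
    lam += c * (th1 - th2)                # multiplier (dual ascent) step
# th1 and th2 both approach (a + b) / 2 = 2.0
```

For this strongly convex toy problem the error halves roughly every iteration, illustrating the global convergence for any c > 0.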
The twoblock ADM for decentralized consensus optimization
The two-block ADM cannot be directly applied to problem (2) because its constraints couple the variables pair by pair; there are no obvious two blocks. To overcome this, [32] introduces a new block of auxiliary variables and reformulates (2) as:

min_{{x^{(i)}},{z_{ij}}} Σ_{i=1}^{L} f_{i}(x^{(i)}), s.t. x^{(i)}=z_{ij} and x^{(j)}=z_{ij} for all neighboring pairs (i,j). (4)

Here z_{ij} is an auxiliary variable attached to the pair of neighboring copies x^{(i)} and x^{(j)}.
Treating {x^{(i)}} and {z_{ij}} as two blocks of variables, the two-block ADM is applied to problem (4). This technique has been adopted in [15,16] to solve the decentralized consensus optimization problem with neighboring consensus constraints. After eliminating {z_{ij}} from the iterative updates and further simplifications, the two-block ADM for (4) is given below as algorithm TBADM.
Initialization: Each agent i initializes x^{(i)}(0)=0 and α_{i}(0)=0.
Step 1: At time t, each agent i updates its local copy x^{(i)} as:

x^{(i)}(t+1)=argmin_{x} f_{i}(x)+α_{i}(t)^{T}x+c Σ_{j∈N(i)} ‖x−(x^{(i)}(t)+x^{(j)}(t))/2‖^{2},

where N(i) denotes the set of neighbors of agent i and c is a positive constant.

Step 2: At time t, each agent i updates its Lagrange multiplier α_{i} as:

α_{i}(t+1)=α_{i}(t)+c Σ_{j∈N(i)} (x^{(i)}(t+1)−x^{(j)}(t+1)).
Step 3: Repeat Step 1 and Step 2 until convergence.
TBADM is well suited for decentralized computation since the updates require only communication between agents i and j that are one-hop neighbors. A detailed derivation of TBADM can be found in [15,16,32].
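For the quadratic case f_{i}(x)=½(x−y_{i})^{2} (average consensus), the TBADM-style subproblems admit closed-form solutions. The sketch below implements one commonly derived simplified form of such a decentralized consensus ADMM iteration on a small path network; treat the graph, the parameter c, and the closed-form updates as an illustrative reconstruction in the spirit of [15,16], not as the article's exact algorithm.

```python
import numpy as np

# Decentralized consensus ADMM sketch for average consensus:
# each agent i holds y_i, and all agents should agree on mean(y).
# Graph, data, and c are illustrative; the closed-form updates are a
# commonly used simplification for f_i(x) = 0.5*(x - y_i)^2.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])                        # local measurements
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}  # path graph
L, c = len(y), 0.5
x = np.zeros(L)        # local copies x^{(i)}, initialized to 0
alpha = np.zeros(L)    # Lagrange multipliers alpha_i, initialized to 0

for _ in range(5000):
    x_new = np.empty(L)
    for i in range(L):
        d_i = len(neighbors[i])
        s = sum(x[i] + x[j] for j in neighbors[i])
        # closed-form minimizer of the local augmented-Lagrangian subproblem
        x_new[i] = (y[i] - alpha[i] + c * s) / (1 + 2 * c * d_i)
    for i in range(L):
        alpha[i] += c * sum(x_new[i] - x_new[j] for j in neighbors[i])
    x = x_new
# every x[i] approaches mean(y) = 3.0
```

Each agent touches only its own data and its neighbors' previous iterates, which is what makes the scheme decentralized.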
The multiblock ADM
The fact that many practical optimization problems naturally have multiple blocks of variables motivates the development of a class of multi-block ADMs, such as those with parallel splitting [29], with prediction-correction [30], and with Gaussian back substitution [31]. Due to the nature of the decentralized consensus optimization problem (2) and the need for parallelization, we choose the multi-block ADM with parallel splitting in [29].
The multiblock ADM with parallel splitting
Consider an equality-constrained convex program that can be separated into L parts:

min_{θ_{1},…,θ_{L}} Σ_{i=1}^{L} g_{i}(θ_{i}), s.t. Σ_{i=1}^{L} D_{i}θ_{i}=e. (5)

Here, for all i, g_{i} is a convex function, θ_{i} is a block of optimization variables, and D_{i} and e are a matrix and a vector of proper dimensions. At each time t, the multi-block ADM with parallel splitting runs in three steps. Step 1: Updating the auxiliary variable q:
where q is an auxiliary variable, λ is a Lagrange multiplier, and β is a positive constant.

Step 2: Updating the optimization variables {θ_{i}}:

where μ is a positive constant.

Step 3: Updating the Lagrange multiplier λ:
The multi-block ADM guarantees global convergence if the two positive constants β and μ are properly chosen. For the convergence proof and the settings of β and μ, the interested reader is referred to [29].
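As a toy illustration of the parallel-splitting idea (not the exact scheme of [29], whose auxiliary-variable and parameter details we do not reproduce here), consider min Σ_{i} ½(θ_{i}−a_{i})^{2} subject to Σ_{i} θ_{i}=0: all blocks are updated simultaneously from the previous iterate, and the multiplier takes a damped step. The data and the parameters β and γ below are hand-picked illustrative values for which this toy iteration converges.

```python
import numpy as np

# Jacobi-type parallel splitting sketch (illustrative only, not the exact
# method of [29]): min sum_i 0.5*(th_i - a_i)^2  s.t.  sum_i th_i = 0.
# The KKT conditions give th_i* = a_i - mean(a).
a = np.array([1.0, 2.0, 6.0])
beta, gamma = 0.5, 0.25      # penalty and damped dual step (hand-picked)
th = np.zeros(3)
lam = 0.0
for _ in range(500):
    s = th.sum()
    # all blocks are updated in parallel from the PREVIOUS iterate th
    th = (a - lam - beta * (s - th)) / (1 + beta)
    lam += gamma * th.sum()  # damped multiplier step
# th approaches a - mean(a) = [-2, -1, 3]
```

The parallel block update is what maps naturally onto a network of agents, since no block needs another block's current-iteration value.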
The multiblock ADM for decentralized consensus optimization
Applying the multi-block ADM to (2) directly yields a decentralized algorithm, without the need to introduce a new block of auxiliary variables and then eliminate them, as we did for the two-block ADM. We provide the algorithm to solve (2) based on the multi-block ADM with parallel splitting, denoted MBADM, below. A detailed derivation of MBADM is given in Appendix 1.
Initialization: Each agent i initializes q_{i}(0)=0, x^{(i)}(0)=0, and λ_{i}(0)=0.
Step 1: At time t, each agent i updates its auxiliary variable q_{i}as:
Step 2: At time t, each agent i updates its local copy x^{(i)}as:
Step 3: At time t, each agent i updates its Lagrange multiplier λ_{i}as:
Step 4: Repeat Step 1 to Step 3 until convergence.
In each iteration, to update q_{i}(t + 1) and λ_{i}(t), agent i needs the N×1 vectors x^{(j)}(t) from all of its neighbors j∈N(i).
Convergence rate analysis
Convergence rate is a significant issue for decentralized algorithms, since it directly influences the overall communication cost. For general separable convex programs, [29,34] prove sublinear convergence rates on the order of O(1/t), with t being the iteration number.
The average consensus problem gives rise to problem (2) with f_{i}(x^{(i)})=(1/2)‖x^{(i)}−y_{i}‖^{2}, where y_{i} is the measurement held by agent i; the optimal solution is the average of all measurements.
Convergence rate of MBADM
To analyze the convergence rate of MBADM for the average consensus problem, we first rewrite MBADM in state transition form and then use spectral analysis tools to bound the convergence rate. Our approach is similar to that in [35] for the two-block ADM.
According to the derivation in Appendix 1, we can rewrite MBADM in state transition form. Define the state vector s_{M}(t + 1)=[x^{(1)}(t + 1),…,x^{(L)}(t + 1),x^{(1)}(t),…,x^{(L)}(t)]^{T}; the corresponding state transition equation of MBADM is:
Here the state transition matrix Φ_{M}is defined as:
with Γ_{M} being an L×L matrix whose (i,i)th entry is
Proposition 1
(convergence and convergence rate of MBADM on average consensus) The state transition equation (6) defined above has the following properties:
Property 1
The matrix Φ_{M} has an eigenvalue ρ_{M1}=1 with multiplicity 1, and its corresponding left and right eigenvectors are:
and
respectively. Note that l_{M1} and r_{M1} are chosen subject to l_{M1}r_{M1}=1.
Property 2
Define:
where ρ_{Mi} is the ith eigenvalue of Φ_{M}. If ρ_{M}<1, then the limit property of s_{M}(t) is:
Further, with κ_{M} denoting the size of the largest Jordan block of Φ_{M}, the convergence rate is:
Proof of Property 1 is given in Appendix 1. Property 2 follows from the classical convergence rate analysis of state transition equations. If ρ_{M1}=1 and ρ_{M}<1, then there exists a unique
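The spectral recipe behind Proposition 1 — find the unit eigenvalue with its consensus eigenvector, then read the convergence rate off the second-largest eigenvalue magnitude — can be illustrated on any linear state transition. The sketch below uses a simple Metropolis-weight averaging matrix W on a path graph as a stand-in for Φ_{M}; this substitute matrix is an assumption for illustration only.

```python
import numpy as np

# Spectral-radius estimate of convergence rate for a linear iteration
# s(t+1) = Phi s(t). As a stand-in for Phi_M we use a Metropolis
# averaging matrix W on a 4-node path graph (illustrative only).
L = 4
edges = [(0, 1), (1, 2), (2, 3)]
deg = np.zeros(L)
for i, j in edges:
    deg[i] += 1
    deg[j] += 1
W = np.zeros((L, L))
for i, j in edges:
    W[i, j] = W[j, i] = 1.0 / (1.0 + max(deg[i], deg[j]))  # Metropolis weight
np.fill_diagonal(W, 1.0 - W.sum(axis=1))                   # rows sum to 1

mags = np.sort(np.abs(np.linalg.eigvals(W)))[::-1]
rho1, rho = mags[0], mags[1]   # rho1 = 1 (consensus mode); rho < 1 sets the rate
# The error of x(t+1) = W x(t) decays like rho**t toward the average:
x = np.array([1.0, 2.0, 3.0, 4.0])
for _ in range(200):
    x = W @ x
# x approaches the average 2.5 at a geometric rate governed by rho (about 0.80 here)
```

The analogue for MBADM replaces W with Φ_{M} and checks ρ_{M}<1 for the chosen μ and β.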
Remark 1
Note that the
In Property 2, there is a condition that ρ_{M}<1. It does not necessarily hold for all choices of μ and β. Next we show two nontrivial special cases in which the condition in Property 2 is satisfied. The first special case connects MBADM with TBADM. Analysis of these two special cases, together with numerical simulations, provides guidelines for parameter selection in MBADM.
Proposition 2
(two nontrivial special cases) We have ρ_{M}<1 in either of the following two cases. Case 1: The parameters μ and β are chosen such that μ = 2β > 0; further,
Remark 2
The proof of Proposition 2 is given in Appendix 1. In Case 1, we set μ=2β>0, which leads to the equivalence between MBADM and TBADM, as we will show in the next subsection. In Case 2, we set μ=β>0, which brings faster convergence for the average consensus problem according to numerical simulations (see Section 5.2). Hence we recommend setting β=τμ with a fixed ratio
Connection between MBADM and TBADM
To show the connection between MBADM and TBADM, we also write TBADM in state transition form. Note that [35] considers another kind of two-block ADM for the average consensus problem, in which the consensus constraints are quadratically penalized with different weights in the augmented Lagrangian function. In TBADM, the consensus constraints are quadratically penalized with the same weight c.
We define the state vector s_{T}(t + 1)=[x^{(1)}(t + 1),…,x^{(L)}(t + 1),x^{(1)}(t),…,x^{(L)}(t)]^{T}; according to the derivation in Appendix 1, the corresponding state transition equation is:
Here the state transition matrix Φ_{T}is defined as:
with I_{L×L} being the L×L identity matrix, 0_{L×L} being the L×L zero matrix, Γ_{T} being an L×L matrix whose (i,i)th entry is 1 and (i,j)th entry is
Comparing the state transition equations of MBADM and TBADM, we can find that TBADM
is indeed a special case of MBADM when c=μ=2β>0. In this sense, MBADM provides more flexibility in parameter selection than TBADM.
According to our simulations in Section 5.2, setting β=τμ with
Let ρ_{Ti} be the ith eigenvalue of Φ_{T}. Apparently ρ_{T1}=1. Defining:
and denoting κ_{T} as the size of the largest Jordan block of Φ_{T}, we can prove that TBADM has a similar
Numerical experiments
In this section, we present numerical simulations that demonstrate the performance of MBADM on decentralized consensus optimization problems. In particular, we are interested in how the communication cost scales with the network size.
Simulation settings
In the numerical experiments, we consider the case that the agents cooperatively solve a least-squares problem. Each agent i has a measurement matrix
In the simulation, we assume that L agents are uniformly randomly deployed in a 100×100 area. All agents have a common communication range r_{C}, which is chosen such that the networked multiagent system is connected. Given r_{C}, the average node degree d can be calculated. We consider the following three scenarios: #1) L=50, M=1, N=1, {A_{i}=1}, r_{C}=30, d≃12; #2) L=50, M=10, N=5, r_{C}=30, d≃12; #3) L=200, M=10, N=5, r_{C}=15, d≃12. Scenario #1 is the average consensus test. Throughout the simulations, we set β=τμ in MBADM with τ=0.9.
Convergence rate for average consensus
Under different choices of c, μ, and β, the values of ρ_{T} for TBADM and of ρ_{M} for MBADM in scenario #1 are shown in Figure 1. For TBADM, ρ_{T} sharply decreases as c increases from 0; after a certain turning point (at c^{∗}≃0.17), which corresponds to the fastest convergence rate, ρ_{T} steadily increases. The curve of ρ_{M} for MBADM is more complicated due to the existence of two parameters, μ and β. For each μ, ρ_{M} steadily decreases as β increases from 0, then sharply jumps above 1, which corresponds to divergence.
The larger μ is, the wider the convergence range of β, but the side effect is a relatively slower convergence rate. The curve of particular interest to us is μ=c^{∗}≃0.17. On this curve, 2β=c^{∗}≃0.17 corresponds to Case 1 in Proposition 2, namely, where MBADM reduces to TBADM. Increasing 2β from c^{∗}, ρ_{M} still decreases until reaching a turning point at 2β=2c^{∗}≃0.34, which corresponds to Case 2 in Proposition 2. This simulation validates our analysis in Section 4, as well as the proposed parameter selection rule (namely, setting a ratio τ,
Figure 1. Curves of ρ_{T} and ρ_{M} for TBADM and MBADM in scenario #1. The dashed line is for TBADM and its turning point corresponds to c^{∗}≃0.17; the solid line is for MBADM with μ=2c^{∗}≃0.34; the four dotted lines are for MBADM with μ=0.1, 0.3, 0.5, and 0.7.
Simulation results on the actual convergence behavior are shown in Figure 2. By absolute error we denote the ℓ_{2}-norm of the distance between the current solution and the centralized optimal solution. Though the convergence rates of MBADM and TBADM are of the same order of magnitude, MBADM is slightly superior to TBADM.
Figure 2. Convergence of the decentralized consensus optimization algorithms for scenario #1. Here β = τμ with τ = 0.9.
According to the theoretical analysis in Sections 4.1 and 4.2, the estimated convergence
rates of MBADM and TBADM are
Performance comparison
Figures 3 and 4 depict the convergence behavior of the two decentralized consensus optimization algorithms for scenarios #2 and #3, respectively. The parameters μ and c are tuned to be near their best values. Here we still have β=τμ with τ=0.9. For both the medium network in scenario #2 and the large network in scenario #3, both algorithms linearly converge to the optimal solution. Comparing the two decentralized algorithms, MBADM outperforms TBADM in each scenario in terms of convergence rate.
Figure 3. Convergence of the decentralized consensus optimization algorithms for scenario #2. The parameters c and μ are tuned to near the best, and β=τμ with τ=0.9
Figure 4. Convergence of the decentralized consensus optimization algorithms for scenario #3. The parameters c and μ are tuned to near the best, and β = τμ with τ = 0.9.
Of particular interest to us is whether the decentralized algorithms are scalable with the network size. Comparing Figure 3, with L=50 agents and d≃12, to Figure 4, with L=200 agents and d≃12, we find that the convergence rates of the two algorithms depend more on the average node degree than on the network size. These numerical experiments support the well-recognized claim that decentralized optimization can improve the network scalability of a networked multiagent system.
Communication cost
Communication cost, in terms of energy consumption and bandwidth, is the major design consideration of a resourcelimited networked multiagent system, and can be approximately evaluated by the volume of information exchange during the decentralized consensus optimization process. Ignoring the extra burden of coordinating the network, for each agent, the communication cost is proportional to the number of iterations multiplied by the volume of information exchange per iteration. Therefore, reducing the information exchange per iteration is of critical importance to the design of lightweight algorithms.
Comparing the two decentralized consensus optimization algorithms, the information exchange per iteration is decided by the communication mode of the agents, namely broadcast or unicast. In the broadcast mode, one agent can send one piece of information to all of its neighbors with a single transmission; in contrast, in the unicast mode, the agent needs multiple transmissions to do so. The two modes both have their pros and cons. The broadcast mode exploits the nature of wireless communication, but may bring difficulties in coordinating the network, such as avoiding collisions. Though the unicast mode consumes many more transmissions, randomized-gossip-like schemes are very useful for robust communication [37]. The average volume of information exchange per iteration of the two decentralized consensus optimization algorithms is outlined in Table 1, for both the broadcast and unicast modes.
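A back-of-the-envelope sketch of the per-agent communication volume under the two modes; the assumption that the iteration count grows linearly with the average degree d follows the article's scaling discussion, and the function name, `iters_per_d`, and the example numbers are hypothetical.

```python
# Rough per-agent communication volume (in transmitted real numbers).
# Assumption (from the scaling discussion in the text): iterations ~ d.
# comm_volume, iters_per_d, and the example numbers are hypothetical.
def comm_volume(N, d, mode, iters_per_d=1.0):
    iters = iters_per_d * d                          # iteration count ~ degree d
    per_iter = N if mode == "broadcast" else N * d   # one tx vs d unicasts
    return iters * per_iter

vb = comm_volume(N=5, d=12, mode="broadcast")   # ~ N*d   = 60.0
vu = comm_volume(N=5, d=12, mode="unicast")     # ~ N*d^2 = 720.0
```

This reproduces the ∼Nd (broadcast) versus ∼Nd^{2} (unicast) scaling discussed around Table 1.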
Table 1. Average volume of information exchange per iteration
In summary, the decentralized consensus optimization algorithms, no matter with the
broadcast or unicast mode, are scalable to the network size. Since the number of iterations
is proportional to the average node degree d, the overall average volume of information exchange is ∼Nd for the broadcast mode and ∼Nd^{2} for the unicast mode. As a comparison, consider a centralized networked multiagent
system uniformly randomly deployed in a twodimensional area with a fusion center
which collects measurement vectors from all agents. The average volume of information
exchange is
Conclusion
This article considers solving the decentralized consensus optimization problem in a networked multiagent system with the parallel-splitting multi-block ADM. The traditional ADM can be used, but it requires the introduction of a second block of auxiliary variables, whereas our method takes advantage of the problem's nature of having multiple blocks of variables. We analyze the rate of convergence of our method applied to the average consensus problem; the analysis shows that the two-block ADM is a special case of the multi-block ADM on average consensus. With extensive numerical experiments, we demonstrate the effectiveness of the proposed algorithm.
In the implementation of a networked multiagent system, practical issues such as packet loss, asynchrony, and quantization are inevitable. This article assumes that the communication links are reliable, the network time is slotted and well synchronized, and the exchanged information is not quantized. We would like to address these issues in future research.
Appendix 1
This section provides the derivations and proofs of the theoretical results in the article.
Development of MBADM
The decentralized consensus optimization problem (2) with neighboring consensus constraints can be rewritten in the form of (5). Clearly, g_{i}=f_{i}, θ_{i}=x^{(i)}, and e is an L^{2}N×1 zero vector. Each D_{i} is an L^{2}N×N matrix with L^{2} blocks of N×N matrices. Each block of D_{i} can be defined as follows. Consider an L×L matrix U^{(i)}, whose
At time t, the multiblock ADM works as follows: Step 1: Updating the auxiliary variables {q_{ij}}:
Step 2: Optimizing the local copies{x^{(i)}}:
Step 3: Updating the Lagrange multipliers {λ_{ij}}:
Note that β and μ are positive constant parameters used by the multi-block ADM.
The updating rules (8), (9), and (10) can be further simplified. Since we often set {λ_{ij}(0)} to 0, (8) and (10) imply that q_{ij}(t + 1)=−q_{ji}(t + 1) and λ_{ij}(t + 1)=−λ_{ji}(t + 1). Summing up the two sides of (8) and (10) and defining a new auxiliary variable
Hence (9) simplifies to:
State transition equation of MBADM
Combining the updating rules of q_{i}(t + 1) and q_{i}(t) in (11) and the updating rule of λ_{i}(t) in (12), we get:
Substituting
Combining the optimality conditions of x^{(i)}(t + 1) and x^{(i)}(t) and (14) leads to:
The initial state is
Proof of Property 1 in Proposition 1
It is straightforward to show that ρ_{M1}=1 is an eigenvalue of Φ_{M} and that l_{M1} and r_{M1} are its corresponding left and right eigenvectors. Next we prove by contradiction that ρ_{M1} = 1 has multiplicity 1. If ρ_{M1} = 1 belonged to a larger Jordan block, there would exist a vector
or equivalently:
Denote the real parts of w_{k} and w_{j} as Re(w_{k}) and Re(w_{j}), respectively. Recalling that Re(w_{k})≥Re(w_{j}) and taking the real part of (18), we have:
This leads to a contradiction. Hence ρ_{M1}=1 is an eigenvalue of Φ_{M} with multiplicity 1.
Proof of Proposition 2
Denote the ith eigenvalue of Φ_{M} as ρ_{Mi}. Its eigenvectors must have the form [ρ_{Mi}v^{T},v^{T}]^{T}, where v=[v_{1},…,v_{L}]^{T} is a nonzero vector, since the lower half of Φ_{M} is [I_{L×L},0_{L×L}]. Suppose that v_{k} has the largest norm (here we use |·| to denote the norm of a complex number) among all elements of v. Then picking the kth row of Φ_{M}[ρ_{Mi}v^{T},v^{T}]^{T} = ρ_{Mi}[ρ_{Mi}v^{T},v^{T}]^{T}, we have:
or equivalently:
Since v_{k} has the largest norm among all elements of v, taking norms on both sides of (21) leads to:
Notice that the inequalities become equalities if and only if
Let us consider the two nontrivial special cases.
Case 1
The parameters μ and β are chosen such that μ=2β>0; further,
In this case,
Define
Recall that
Case 2
The parameters μ and β are chosen such that μ=β>0; further,
In this case,
Let us prove the conclusion by contradiction. Suppose that there exists a ρ_{Mi} with |ρ_{Mi}|≥1 satisfying (26); then:
Again, the inequalities turn to equalities only for ρ_{M1}=1. For any other eigenvalue ρ_{Mi}, we have |ρ_{Mi}|<1, which contradicts |ρ_{Mi}|≥1. Therefore, |ρ_{Mi}|<1 for i≠1.
State transition equation of TBADM
Substituting
the optimality condition for x^{(i)}(t + 1) is:
Considering x^{(i)}(t), the optimality condition is correspondingly:
Combining (28) and (29) with:
the state transition equation for agent i is:
Competing interests
The authors declare that they have no competing interests.
Acknowledgements
The work of Qing Ling is supported in part by NSFC grant 61004137 and Fundamental Research Funds for the Central Universities. The work of Wotao Yin is supported in part by ARL and ARO grant W911NF0910383 and NSF grants DMS0748839 and ECCS1028790. The work of Xiaoming Yuan is supported in part by the General Research Fund No. 203311 from Hong Kong Research Grants Council.
References

W Ren, R Beard, E Atkins, Information consensus in multivehicle cooperative control: collective group behavior through local interaction. IEEE Control Syst. Mag 27, 71–82 (2007)

R OlfatiSaber, Kalmanconsensus filter: optimality, stability, and performance. Proceedings of CDC (Shanghai, China, 2009), pp. 7036–7042

S Boyd, N Parikh, E Chu, B Peleato, J Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn 3, 1–122 (2010)

J Tsitsiklis, Problems in decentralized decision making and computation (MIT, Ph.D Thesis, 1984)

A Nedic, A Ozdaglar, Distributed subgradient methods for multiagent optimization. IEEE Trans. Autom. Control 54, 48–61 (2009)

K Srivastava, A Nedic, Distributed asynchronous constrained stochastic optimization. IEEE J. Sel. Topics Signal Process 5, 772–790 (2011)

M Rabbat, R Nowak, Distributed optimization in sensor networks. Proceedings of IPSN (Berkeley, USA, 2004), pp. 20–27

M Rabbat, R Nowak, Quantized incremental algorithms for distributed optimization. IEEE J. Sel. Areas Commun 23, 798–808 (2006)

L Xiao, S Boyd, S Kim, Distributed average consensus with least-mean-square deviation. J. Parallel Distrib. Comput 67, 33–46 (2007)

S Kar, J Moura, Distributed consensus algorithms in sensor networks: quantized data and random link failures. IEEE Trans. Signal Process 58, 1383–1400 (2010)

A Olshevsky, Efficient Information Aggregation Strategies for Distributed Control and Signal Processing (Ph.D Thesis, MIT, 2010)

I Schizas, A Ribeiro, G Giannakis, Consensus in ad hoc WSNs with noisy links  Part I: distributed estimation of deterministic signals. IEEE Trans. Signal Process 56, 350–364 (2008)

G Mateos, I Schizas, G Giannakis, Distributed recursive leastsquares for consensusbased innetwork adaptive estimation. IEEE Trans. Signal Process 57, 4583–4588 (2009)

Q Ling, Z Tian, Decentralized sparse signal recovery for compressive sleeping wireless sensor networks. IEEE Trans. Signal Process 58, 3816–3827 (2010)

J Bazerque, G Giannakis, Distributed spectrum sensing for cognitive radio networks by exploiting sparsity. IEEE Trans. Signal Process 58, 1847–1862 (2010)

G Mateos, J Bazerque, G Giannakis, Distributed sparse linear regression. IEEE Trans. Signal Process 58, 5262–5276 (2010)

D Jakovetic, J Xavier, J Moura, Cooperative convex optimization in networked systems: augmented Lagrangian algorithms with direct gossip communication. IEEE Trans. Signal Process 59, 3889–3902 (2011)

M Cetin, L Chen, J Fisher III, A Ihler, R Moss, M Wainwright, A Willsky, Distributed fusion in sensor networks. IEEE Signal Process. Mag 23, 42–55 (2006)

J Predd, S Kulkarni, V Poor, Distributed learning in wireless sensor networks. IEEE Signal Process. Mag 24, 56–69 (2007)

J Predd, S Kulkarni, H Poor, A collaborative training algorithm for distributed learning. IEEE Trans. Inf. Theory 55, 1856–1871 (2009)

U Khan, S Kar, J Moura, Higher dimensional consensus: learning in largescale networks. IEEE Trans. Signal Process 58, 2836–2849 (2010)

A Jadbabaie, A Ozdaglar, M Zargham, A distributed Newton method for network optimization. Proceedings of CDC (Shanghai, China, 2009), pp. 2736–2741

J Koshal, A Nedic, U Shanbhag, Multiuser optimization: distributed algorithms and error analysis. SIAM J. Optimiz 21, 1046–1081 (2011)

P Wan, M Lemmon, Distributed network utility maximization using eventtriggered augmented Lagrangian methods. Proceedings of ACC (St. Louis, USA, 2009), pp. 3298–3303

F Zhao, J Shin, J Reich, Information-driven dynamic sensor collaboration. IEEE Signal Process. Mag 19, 61–72 (2002)

F Zhao, L Guibas, Wireless Sensor Networks: an Information Processing Approach (Morgan Kaufmann, Burlington, USA, 2004)

I Schizas, G Giannakis, S Roumeliotis, A Ribeiro, Consensus in ad hoc WSNs with noisy links – part II: distributed estimation and smoothing of random signals. IEEE Trans. Signal Process 56(4), 1650–1666 (2008)

A Ribeiro, I Schizas, S Roumeliotis, G Giannakis, Kalman filtering in wireless sensor networks: reducing communication cost in state estimation problems. IEEE Control Systs. Mag 30, 66–86 (2010)

M Tao, Some parallel splitting methods for separable convex programming with O(1/t) convergence rate, in press

B He, M Tao, M Xu, X Yuan, Alternating directions based contraction method for generally separable linearly constrained convex programming problems, in press

B He, M Tao, X Yuan, Alternating direction method with Gaussian back substitution for separable convex programming. SIAM J. Optim 22, 313–340 (2012)

D Bertsekas, J Tsitsiklis, Parallel and Distributed Computation: Numerical Methods (Athena Scientific, Nashua, USA, 1997)

D Bertsekas, Nonlinear Programming (Athena Scientific, Nashua, USA, 1999)

B He, X Yuan, On the O(1/n) convergence rate of Douglas-Rachford alternating direction method. SIAM J. Num. Anal 50, 700–709 (2012)

T Erseghe, D Zennaro, E Dall’Anese, L Vangelista, Fast consensus by the alternating direction multipliers method. IEEE Trans. Signal Process 59, 5523–5537 (2011)

J Rosenthal, Convergence rates for Markov chains. SIAM Rev 37, 387–405 (1995)

S Boyd, A Ghosh, B Prabhakar, D Shah, Randomized gossip algorithms. IEEE Trans. Inf. Theory 52, 2508–2530 (2006)